Bling is in the air. On September 9th Apple unveiled its latest iPhone 16 series at an event called “It’s Glowtime”. The name referred to the sheen around Siri, its souped-up voice assistant. But it was just as appropriate for the new colour of its snazziest iPhone 16 Pro model: “desert titanium”, or in other words, gold.
A bit lacking, though, was zing. Tim Cook, the company’s chief executive, played up the promise of the phones’ generative artificial-intelligence (AI) features, which he trailed with much hoopla in June under the moniker “Apple Intelligence”. But though the devices come with Apple’s new superfast A18 chips to power AI, iPhone buyers will have to wait until October at the earliest for the first features, and even then only in beta. The demos look ho-hum. If you point the camera at a restaurant, Apple Intelligence can tell you what’s on the menu. You can type a request to Siri, as well as ask it questions. Investors hope that eventually more conversational and personalised AI features will reboot iPhone sales, which account for about half of Apple’s revenues but have sagged lately. They could be waiting a while.
Apple is one of many firms that want to take generative AI beyond giant data centres, known as the cloud, and run it on smaller devices, known as the edge. Samsung, Apple’s Android-based smartphone rival, got a head start, launching its Galaxy S24 with some generative-AI features earlier this year. So did Microsoft, which has launched Windows PCs designed for AI, called Copilot+. But by and large the market is still up for grabs. Cracking it will not be easy.
Today large language models (LLMs) are mostly trained on graphics processing units (GPUs) that require so much energy it can take a nuclear-power plant to fuel them. They also need huge amounts of memory and unfathomable quantities of data. All that can cost hundreds of millions of dollars.
Even once they are trained, running these mega-models is expensive. According to one estimate, it costs OpenAI, the maker of ChatGPT, 36 cents every time someone asks its bot a question. Edge devices instead deploy smaller models, distilled from their cloud-based big brothers. These are cheaper, and also faster. The goal is to reach such low levels of latency that response times feel almost human. Edge AI can also learn about a user from their interactions with their device (Apple calls this “semantic indexing”). For instance, it will know if a user’s mother is calling and act appropriately.
In practice, however, shifting AI to the edge is not straightforward. One problem is performance. Complex queries, such as using an AI bot to plan a holiday, will still require cleverer cloud-based LLMs. Another problem is computational power. Even smaller AI models require oodles of it to run, quickly draining a device’s batteries.
Companies are experimenting with various solutions to these problems. Apple Intelligence will offer on-device AI as a first port of call, but send trickier queries to the firm’s private cloud. The service will direct the most idiosyncratic requests to third-party LLMs such as ChatGPT. Apple promises to do so only with the user’s permission, but the approach could still worry the privacy-conscious. Devices, especially smartphones, have access to vast amounts of users’ personal data: whom they call, where they live, what they spend, what they look like. Some may prefer that if generative AI tools use any of that information, it remains on-device.
Tech firms are also making use of alternatives to GPUs that are less energy-hungry, such as neural processing units (NPUs), to run AI models on the edge. Qualcomm, which makes NPUs and various other chips for edge devices, talks about maximising “performance per watt”. Compared with GPUs, whose costs can be stratospheric, NPUs are also cheaper. No one, after all, wants a phone that costs as much as a data centre.
Plenty of firms have an interest in shifting AI to devices. Cloud-based LLMs are heavily dependent on Nvidia, the leading maker of GPUs. When it comes to edge AI, by contrast, “there’s nobody that dominates,” says Taner Ozcelik, a former Nvidia executive who now runs Mythic, a startup making energy-efficient chips for AI devices.
Although no single firm may gain as much from edge AI as Nvidia has from the cloud variety, there would still be big winners, says Neil Shah of Counterpoint, a research firm. Making the technology work could not only trigger a supercycle in device sales, but also create new opportunities for apps and digital advertising. For the moment, though, edge AI is barely ready for showtime, let alone Glowtime.