Explained: Apple’s new method for running LLMs on iPhones

Apple GPT running on iPhones may soon become a reality. AI researchers at the Cupertino-based tech giant have reportedly made a key breakthrough in deploying large language models (LLMs) on iPhones and other Apple devices. Apple’s researchers say this can be achieved on devices with limited memory by using a new flash memory utilisation technique they have developed.
LLMs hunger for data and memory
LLM-based chatbots like ChatGPT and Claude are highly data- and memory-intensive. These models typically require large amounts of memory to run, which can be a challenge for devices like iPhones that have limited memory capacity.
To tackle this issue, Apple researchers have developed a new technique that uses flash memory to store the AI model’s data. This is the same memory where apps and photos are also stored.
How Apple is planning to run LLMs on iPhones
In a new research paper titled “LLM in a flash: Efficient Large Language Model Inference with Limited Memory” (spotted first by MacRumors), the authors note that flash storage is more abundant in mobile devices than the RAM traditionally used for running LLMs. Their method works around the RAM limitation using two key techniques that minimise data transfer and maximise flash memory throughput. These techniques are:
Windowing: This is like a recycling method. Instead of loading new data every time, the AI model will reuse some of the data it has already processed. This reduces the requirement for constant memory fetching and makes the process faster and smoother.
Row-Column Bundling: This technique is similar to reading a book in larger chunks instead of one word at a time. It groups data more efficiently so that it can be read faster from flash memory. This method also speeds up the AI’s ability to understand and generate language.
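To make the two ideas concrete, here is a simplified Python sketch. It is an illustration under stated assumptions, not Apple’s implementation: the class names `FlashStore` and `WindowedCache`, the window size and the toy weights are all hypothetical, and flash storage is simulated with an ordinary dictionary. Bundling is modelled by storing the two weight vectors a neuron needs as a single record, so one read returns both. Windowing is modelled by keeping in memory only the neurons used by the last few tokens and reading from “flash” only what is missing.

```python
from collections import deque


class FlashStore:
    """Hypothetical stand-in for flash storage: one bundled record per neuron."""

    def __init__(self, up_columns, down_rows):
        # Bundling (assumed layout): the two vectors a neuron needs are kept
        # side by side, so a single read returns both of them.
        self.records = {
            i: (up_columns[i], down_rows[i]) for i in range(len(up_columns))
        }
        self.reads = 0  # counts simulated flash reads

    def read(self, neuron_id):
        self.reads += 1
        return self.records[neuron_id]


class WindowedCache:
    """Keeps in memory only the neurons used by the last `window_size` tokens."""

    def __init__(self, store, window_size):
        self.store = store
        self.window = deque(maxlen=window_size)  # active sets of recent tokens
        self.in_memory = {}

    def step(self, active_neurons):
        active = set(active_neurons)
        # Windowing: reuse what is already in memory, read only what is missing.
        for n in sorted(active - set(self.in_memory)):
            self.in_memory[n] = self.store.read(n)
        self.window.append(active)  # the oldest token drops out automatically
        # Evict neurons that no token in the current window uses any more.
        still_needed = set().union(*self.window)
        for n in list(self.in_memory):
            if n not in still_needed:
                del self.in_memory[n]
        return {n: self.in_memory[n] for n in active}


# Toy data: six neurons, each with a two-element "column" and "row".
up_columns = [[float(i)] * 2 for i in range(6)]
down_rows = [[float(-i)] * 2 for i in range(6)]

store = FlashStore(up_columns, down_rows)
cache = WindowedCache(store, window_size=2)

# Three consecutive tokens whose active neurons overlap heavily.
for active in ([0, 1, 2], [1, 2, 3], [2, 3, 4]):
    cache.step(active)

print(store.reads)  # 5 flash reads, versus 9 if every token reloaded its neurons
```

In this toy run, the overlap between consecutive tokens is what saves reads: the more the active neurons repeat from one token to the next, the less has to be fetched from flash.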
The paper suggests that the combination of these methods will allow iPhones to run AI models up to twice the size of the device’s available memory. Compared with naive loading approaches, the method is expected to increase inference speed by 4-5 times on standard processors (CPUs) and by 20-25 times on graphics processors (GPUs).
The authors note: “This breakthrough is particularly crucial for deploying advanced LLMs in resource-limited environments, thereby expanding their applicability and accessibility.”
How this method will improve AI features on iPhones
The latest breakthrough in AI efficiency could open up new possibilities for future iPhones. These include more advanced Siri capabilities, real-time language translation and other AI-driven features in photography and augmented reality. The technology would also help iPhones run complex AI assistants and chatbots on-device, which Apple is already said to be working on.
In February, Apple held an AI summit and briefed employees on its large language model. Eventually, Apple’s work on generative AI may be incorporated into its Siri voice assistant.

Apple is developing a smarter version of Siri that’s deeply integrated with AI, reports Bloomberg. The company is planning to update the way Siri interacts with the Messages app, which would allow users to field complex questions and auto-complete sentences more effectively. Apple is also reportedly planning to add AI to as many apps as possible.
The iPhone maker is also reportedly developing its own generative AI model called “Ajax”. Ajax is said to operate on 200 billion parameters, which suggests a high level of complexity and capability in language understanding and generation.
Internally known as “Apple GPT”, Ajax is aimed at unifying machine learning development across the company. This points to a broader strategy to integrate AI more deeply into Apple’s ecosystem.
Rumours also suggest that Apple may include some kind of generative AI feature with iOS 18, which will be available on the iPhone and iPad around late 2024. In October, analyst Jeff Pu said that Apple would build a few hundred AI servers in 2023, with more expected to arrive in 2024. Apple is likely to offer a combination of cloud-based AI and AI with on-device processing.
