Meta is partnering with Qualcomm to enable Llama 2-based applications to run on Qualcomm's Snapdragon chips starting in 2024.
Today's phones and laptops generally lack the computing power to run LLMs like Llama 2 effectively. If you want to use these powerful AI tools, you typically have to run them on cloud computing servers.
When Meta announced the release of Llama 2, it partnered with Microsoft to make Azure its preferred cloud computing platform. But the next generation of Qualcomm's Snapdragon chips is expected to bring enough processing power to personal devices that Llama 2-based AI applications can run locally, with no need for cloud computing.
During an interview with Stability AI CEO Emad Mostaque earlier this year, the interviewer was incredulous when Mostaque claimed that by 2024 we would have ChatGPT running on our phones without the internet. But now it seems his bold claim wasn't far-fetched at all.
Regarding its partnership with Meta, Qualcomm executive Durga Malladi said, “We applaud Meta’s approach to open and responsible AI and are committed to driving innovation and reducing barriers to entry for developers of any size by bringing generative AI on-device.”
The current Snapdragon X75 chip already uses AI to improve network connectivity and location accuracy in 5G mobile phones. Once Llama 2 is running on-device on the next-generation chip, you'll be able to chat with your AI app even in airplane mode or in an area with no coverage.
The other big advantage of this development is that it will drive down the cost of AI applications. Making API calls to an LLM running on a cloud server costs money, and those costs are inevitably passed on to the app's users.
If the app can interact with an LLM running on-device, those per-call costs disappear. And even with the impressive speed of 5G networks, a locally run AI app such as a voice assistant will respond faster, since there is no network round trip.
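To make the contrast concrete, here is a minimal sketch of fully offline inference using the open-source llama-cpp-python bindings, one community route to running a quantized Llama 2 model locally today. To be clear, this is not Qualcomm's on-device toolchain (which hasn't been published), and the model file name below is a placeholder for whatever quantized Llama 2 file you have on disk.

```python
# Minimal sketch: local Llama 2 inference with llama-cpp-python.
# Assumes a quantized Llama 2 chat model has already been downloaded;
# the file name below is a placeholder, not an official artifact.
from llama_cpp import Llama

# Load the model from local storage -- no network connection required.
llm = Llama(model_path="./llama-2-7b-chat.q4_0.gguf")

# Every token is generated on-device, so there is no per-request API
# fee and the call still works in airplane mode.
response = llm(
    "Q: Why is on-device inference cheaper than a cloud API? A:",
    max_tokens=64,
    stop=["Q:"],
)
print(response["choices"][0]["text"])
```

The same idea, presumably accelerated by the Snapdragon's NPU rather than a laptop CPU, is what the Qualcomm partnership would bring to mobile apps.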
Running AI apps on-device, without sending data back and forth to cloud servers, will also ease privacy and security concerns, since personal data never has to leave the device.
The Qualcomm-Meta partnership is a big deal and a sign of exciting developments to come. Way back in 2020, Apple was already touting how the M1 chip's Neural Engine sped up machine learning tasks.
Expect to see many more chip makers, Nvidia among them, working on running large AI models on-device in the near future.