Llama 2 to run on your device without the internet by 2024

  • Qualcomm will run Meta's Llama 2 LLM on its Snapdragon chips in mobile devices and PCs by 2024.
  • On-device capability will remove the need for cloud computing to run large language models.
  • New chips will see LLMs run on mobile phones, laptops, and VR headsets without internet access.

Qualcomm and Meta partner to bring Llama 2 on-device

Meta is partnering with Qualcomm to enable Llama 2-based applications to run on Qualcomm's Snapdragon chips by 2024.

Current mobile phones and laptops lack the computing power to run LLMs like Llama 2 effectively. If you want to use these powerful AI tools, you generally need to run them on cloud computing servers.

When Meta announced the release of Llama 2, it partnered with Microsoft to make Azure its preferred cloud computing solution. But the next generation of Qualcomm's Snapdragon chips is expected to bring enough processing power to personal devices for Llama 2-based AI applications to run without the need for cloud computing.

During an interview earlier this year, the interviewer was incredulous when Stability AI CEO Emad Mostaque claimed that by 2024 we would have ChatGPT running on our phones without the internet. But now it seems his bold claim wasn't far-fetched at all.

Regarding its partnership with Meta, Qualcomm executive Durga Malladi said, "We applaud Meta's approach to open and responsible AI and are committed to driving innovation and reducing barriers to entry for developers of any size by bringing generative AI on-device."

The current Snapdragon X75 chip already uses AI to improve network connectivity and location accuracy in 5G mobile phones. Once Llama 2 runs on-device on the next-generation chip, you'll be able to chat with your AI app even in airplane mode or in an area with no coverage.

The other big advantage of this development is that it will drive down the cost of AI applications. Making API calls to an LLM running on a cloud server costs money, and those costs are inevitably passed on to the app's users.

If the app can interact with an LLM running on-device, then no API costs are incurred. And even with the impressive speed of 5G networks, a locally run AI app like a voice assistant will respond faster because there's no round trip to a server.
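To make the difference concrete, here is a minimal sketch of what on-device inference already looks like today using the open-source llama-cpp-python bindings. The model path and prompt are illustrative assumptions; any locally downloaded, quantized Llama 2 model file would do.

```python
# Minimal on-device inference sketch using llama-cpp-python
# (pip install llama-cpp-python). The model path below is an
# assumption: substitute any quantized Llama 2 GGUF file.
from llama_cpp import Llama

# Load the quantized model from local storage -- no network needed.
llm = Llama(model_path="./models/llama-2-7b-chat.Q4_K_M.gguf")

# Run a completion entirely on the local CPU/GPU, with no API call
# and no per-query cost.
output = llm(
    "Q: What are the benefits of on-device AI? A:",
    max_tokens=64,
    stop=["Q:"],
)

print(output["choices"][0]["text"])
```

Every token here is generated on the device itself, which is exactly what Qualcomm and Meta aim to make practical on phones and laptops with the next generation of Snapdragon chips.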

Running AI apps on-device, without sending data back and forth to cloud servers, will also improve privacy and security, since personal data never has to leave the device.

The Qualcomm-Meta partnership is a big deal and a sign of exciting developments to come. Back in 2020, Apple was already touting how the M1 chip's Apple Neural Engine sped up machine learning tasks.

Expect a lot more chip manufacturers, like Nvidia, to work on running large AI models on-device in the near future.
