Qualcomm, Meta to bring Llama 2 AI to flagship phones in 2024

Qualcomm and Meta are working to make Meta’s large language model, Llama 2, available on devices like smartphones, PCs, VR/AR headsets and vehicles.

The implementation on these devices will allow customers, partners and developers to build use cases, such as intelligent virtual assistants, that can work in areas with no connectivity or even in airplane mode, according to Qualcomm.

While acknowledging that AI will still need to run on both the cloud and devices at the edge, Ignacio Contreras, senior director of marketing at Qualcomm, cited several advantages of running AI on the device itself.

The first one has to do with money. “If you are a developer or if you are a service provider, you need to pay someone to run this on the cloud,” he said. Each time a query is made, that costs money.  

Costs for the cloud providers themselves also climb steeply as more users make AI web queries, he said. In fact, the combination of additional infrastructure, energy, compute power and more may make an AI query cost up to 10 times more than a traditional search, he said.

Contreras said there are also privacy implications. With a cloud-based approach, every query means information is shared with the broader ecosystem, and some jurisdictions prohibit digital data from leaving their borders. If the processing stays on the device, there’s no reason to worry that data is going somewhere it’s not supposed to go.

Personalization is another aspect. Right now, if you ask any of these large language models the same question, you get pretty much the same answer. But if there’s location or contextual information that is available with a device, users can get a much more personalized answer.

“Definitely there are strong benefits of running more things on the device itself,” he said.

Stable Diffusion, ControlNet demos

Qualcomm certainly is no stranger to AI. Earlier this year, Qualcomm announced the world’s first on-device demonstration of Stable Diffusion, a text-to-image generative AI model capable of generating photorealistic images given any text input. It had been primarily confined to running in the cloud.

However, Qualcomm AI Research performed full-stack AI optimizations using the Qualcomm AI Stack to deploy Stable Diffusion on an Android smartphone. More recently, Qualcomm demonstrated ControlNet running entirely on a phone as well. In that demo, AI images were generated on the mobile device in under 12 seconds without requiring any cloud access. 

According to Qualcomm, it has a long history of working with Meta to drive technology innovation. The companies’ current collaboration on the Llama ecosystem spans research and product engineering efforts.

Qualcomm expects to make Llama 2-based AI implementations available on devices powered by Snapdragon starting in 2024.

While it might sound like Qualcomm is betting all of its AI chips on phones, it’s also using AI in the infrastructure side of its business. Last year, the company introduced the Qualcomm Edgewise suite for accelerating 5G rollouts and open RAN adoption. That was tied to its Cellwize Wireless Technologies acquisition.