Meta drops two versions of the Llama 3 model with a third imminent

April 18, 2024
  • Meta made two versions of its Llama 3 model widely available
  • It has released 8-billion- and 70-billion-parameter versions thus far
  • A 400-billion-parameter version and multimodal versions are still being trained

Meta has released the highly anticipated Llama 3 series, with the first two models, Llama 3 8B and Llama 3 70B, now widely available.

Days ago, at an event in London, Meta executives Nick Clegg and Yann LeCun said Llama 3 would arrive within the month.

The first two versions dropped today, making them the third and fourth major open models released this month, after xAI’s Grok-1.5V and Mistral’s Mixtral 8x22B.

Llama 3 is pretrained on an impressive 15 trillion tokens, roughly seven times the data used for Llama 2. The pretraining data also includes four times more code.
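
As a rough check on that multiplier, here is a minimal sketch of the arithmetic. The Llama 2 figure of about 2 trillion pretraining tokens is an assumption taken from Meta's Llama 2 release rather than from this article, and the function and variable names are purely illustrative.

```python
# Rough check of the pretraining-data multiplier quoted above.
LLAMA3_TOKENS = 15e12  # 15 trillion tokens, as stated in the article
LLAMA2_TOKENS = 2e12   # assumption: ~2 trillion tokens, per Meta's Llama 2 release


def fold_increase(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    if old <= 0:
        raise ValueError("old must be positive")
    return new / old


ratio = fold_increase(LLAMA3_TOKENS, LLAMA2_TOKENS)
print(f"Llama 3 saw about {ratio:.1f}x as many tokens as Llama 2")
```

Under that assumption the ratio works out to 7.5, which is in line with the roughly sevenfold figure Meta cites.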

Under the hood, Llama 3 introduces architectural improvements such as a more efficient tokenizer with a larger vocabulary of 128K tokens.

Here’s a quick rundown of Llama 3’s performance:

Performance of Llama 3 8B:

  • Outperforms models like Mistral’s 7B and Google’s Gemma 7B in several benchmarks.
  • Excels in MMLU, ARC, DROP, GPQA (primarily science-based questions), HumanEval (code generation), GSM-8K (math problems), MATH (math benchmark), AGIEval (problem-solving), and BIG-Bench Hard (commonsense reasoning).

70B comparison with other models:

  • Llama 3 70B is competitive with top AI models like Google’s Gemini 1.5 Pro.
  • Beats Gemini 1.5 Pro in MMLU, HumanEval, and GSM-8K.
  • Performs better than Anthropic’s Claude 3 Sonnet (the middle tier of its Claude 3 series) on five benchmarks: MMLU, GPQA, HumanEval, GSM-8K, and MATH.
Llama 3 8B and 70B benchmarks. Source: Meta

Those are excellent scores for an open model (although Meta’s license does have some limitations).

That makes Llama 3 the new top-performing free, open-source (sort of) model.

Llama 3 should also be less stubborn to use, with fewer non-responses and higher accuracy on trivia questions, historical facts, and STEM-related queries.

Llama 3 is poised to become widely available across major platforms, including cloud services and API providers.

Meta is already working to expand Llama 3 to 400 billion parameters and add new capabilities like multimodality, multilingual support, and a longer context window.

Meta’s rogue role in generative AI

In many ways, Meta has emerged as the rebel of the generative AI industry.

Meta Chief AI Scientist Yann LeCun, one of AI’s most well-respected figures, holds what some construe as dissenting views about AI’s direction, including criticism of closed-source projects at Meta’s Big Tech competitors.

Meanwhile, former UK Deputy Prime Minister Nick Clegg, Meta’s head of Global Affairs, has been called out for his occasionally laissez-faire views about Meta’s AI products, which may not surprise any Brits out there.

Last week, Clegg seemed to play down AI’s impact on electioneering and deepfake manipulation, a view that runs counter to the prevailing narrative that deepfakes could be (or already are) profoundly destructive.

As a matter of fact, Meta’s Oversight Board is currently investigating two cases of deepfake pornography. The Board deemed that Meta’s content moderation actions were too slow.

Meta has also been bullish about the improving quality of its models. Joelle Pineau, Meta’s vice president of AI research, said, “In many ways, the models that we have today are going to be child’s play compared to the models coming in five years.”

Pineau also warned, “If we keep on growing our model ever more in general and powerful without properly socializing them, we are going to have a big problem on our hands.” 

Llama 3’s release also comes as Meta’s AI agents on Facebook cause a commotion across social media.

In a Facebook group for New York City parents, a Meta AI assistant – designed to provide advice and answer questions – shocked people by claiming to have a “gifted and disabled child” attending a specific school for the “gifted and talented.”

When confronted by the group members, the AI admitted, “I’m just a large language model, I don’t have personal experiences or children,” in what some labeled a Black Mirror-esque incident.

Llama 3, Grok-1.5, and Mistral’s models shift more power towards open-source communities while further diluting the generative AI market.

But that might be a good thing, as it’s survival of the fittest now, and the ball is firmly in the Microsoft-OpenAI camp, which is anticipated to make the next move in this fascinating game of gen-AI chess.

Sam Jeans

Sam is a science and technology writer who has worked in various AI startups. When he’s not writing, he can be found reading medical journals or digging through boxes of vinyl records.
