Microsoft reportedly building a 500B LLM called MAI-1

May 7, 2024

  • Microsoft is reportedly building a 500B parameter model called MAI-1
  • The project is being led by DeepMind co-founder and former Inflection CEO Mustafa Suleyman
  • MAI-1 will be much bigger than Microsoft’s other models, including its new Phi-3 family of small language models

According to a report by The Information, Microsoft is working on a 500B parameter LLM called MAI-1 that could take on GPT-4 and Google’s Gemini models.

We recently reported on Microsoft’s Phi-3 family of small language models, which ranges from the 3.8B-parameter Phi-3 Mini up to 14B parameters. At 500B parameters, MAI-1 would be by far the largest model Microsoft has trained.

Its size puts it in the same ballpark as GPT-4 and Google’s larger Gemini models. GPT-4 is rumored to have 1.76T parameters, but it’s a Mixture of Experts (MoE) model, so only around 280B parameters are active during inference.

There isn’t any public information about MAI-1’s architecture yet, but if it’s a dense model, as opposed to MoE, it would be among the largest dense models announced. For comparison, the largest version of Meta’s anticipated Llama 3 is expected to have around 400B parameters.

The development of MAI-1 is being led by Mustafa Suleyman, co-founder and former head of applied AI at DeepMind.

Suleyman left DeepMind and went on to co-found AI startup Inflection in 2022. In March this year, Microsoft hired the majority of Inflection’s staff and paid $650 million for the rights to the company’s IP.

MAI-1 is reportedly a completely new Microsoft project rather than a continuation of an existing Inflection effort. There’s no word on a release date, but we might get a preview of MAI-1 at Microsoft’s Build developer conference on 16 May.

Microsoft is OpenAI’s biggest investor, so the fact that it’s developing its own LLMs to rival OpenAI’s has surprised some observers. Is Microsoft hedging its bets, pursuing multiple development strategies in parallel, or something else entirely?

Microsoft’s CTO Kevin Scott tried to downplay the issue. In a post on LinkedIn Scott said, “I’m not sure why this is news, but just to summarize the obvious: we build big supercomputers to train AI models; our partner Open AI uses these supercomputers to train frontier-defining models; and then we both make these models available in products and services so that lots of people can benefit from them. We rather like this arrangement.”

Scott may be sincere in this statement, but when MAI-1 is released it could put Microsoft squarely in competition with the company in which it has invested billions of dollars.

Will MAI-1 be released just in time for OpenAI to upstage it with GPT-5? OpenAI had scheduled an event for this Thursday where the company was expected to share updates and product demonstrations, but the event has since been postponed.

With the mysterious gpt2-chatbot appearing, disappearing, and now reappearing, Microsoft building huge models, and OpenAI keeping us guessing, the AI drama is relentless.


Eugene van der Watt

Eugene comes from an electronic engineering background and loves all things tech. When he takes a break from consuming AI news you'll find him at the snooker table.

