Microsoft reportedly building a 500B parameter LLM called MAI-1

May 7, 2024
  • Microsoft is reportedly building a 500B parameter model called MAI-1
  • The project is being led by DeepMind co-founder and former Inflection CEO Mustafa Suleyman
  • MAI-1 will be much bigger than Microsoft's other models, including its new Phi-3 family of small language models

According to a report by The Information, Microsoft is working on a 500B parameter LLM called MAI-1 that could take on GPT-4 and Google’s Gemini models.

We recently reported on Microsoft’s Phi-3 family of small language models, which range from 3.8B to 14B parameters. At 500B parameters, MAI-1 is set to be the largest model Microsoft has deployed.

Its size puts it in the same ballpark as GPT-4 and Google’s bigger Gemini models. GPT-4 is rumored to have 1.76T parameters, but because it’s a Mixture of Experts (MoE) model, only around 280B of those parameters are active during any single inference pass.

There isn’t any information available about MAI-1’s architecture, but if it’s a dense model rather than an MoE, every one of its 500B parameters would be used on each pass, which would make it a serious contender. For comparison, the largest version of Meta’s anticipated Llama 3 is expected to have around 400B parameters.
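To make the dense-versus-MoE distinction concrete, here is a toy sketch in Python/NumPy. It is purely illustrative, with made-up expert counts and sizes, and is not based on any published detail of GPT-4 or MAI-1; it simply shows how an MoE layer routes each token to only a couple of experts, which is why a model’s total parameter count can be far larger than the parameters actually exercised per inference step.

```python
# Toy Mixture-of-Experts routing sketch (illustrative only; hypothetical sizes).
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 16   # hypothetical number of expert feed-forward blocks
TOP_K = 2          # experts actually used per token
D_MODEL = 64       # hypothetical hidden size

# Each "expert" is a small feed-forward block; here just one weight matrix.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((D_MODEL, NUM_EXPERTS))

def moe_forward(x):
    """Route a single token vector x through only TOP_K of NUM_EXPERTS experts."""
    scores = x @ router                      # one router logit per expert
    top = np.argsort(scores)[-TOP_K:]        # indices of the chosen experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                 # softmax over the chosen experts only
    # Only TOP_K expert weight matrices are touched for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D_MODEL)
out = moe_forward(token)

print(f"Experts used per token: {TOP_K}/{NUM_EXPERTS} "
      f"(~{TOP_K / NUM_EXPERTS:.0%} of expert parameters)")
```

In this toy layer only 2 of 16 experts are multiplied per token, roughly mirroring how a rumored 1.76T parameter MoE model can use a fraction of its weights per forward pass, whereas a dense 500B model would use all 500B every time.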

The development of MAI-1 is being led by Mustafa Suleyman, co-founder and former head of applied AI at DeepMind.

Suleyman left DeepMind and went on to co-found the AI startup Inflection in 2022. In March this year, Microsoft hired the majority of Inflection’s staff and paid $650 million for the rights to the company’s IP.

MAI-1 is apparently a completely new Microsoft project rather than a continuation of an existing Inflection project. There’s no word on a release date, but we might get a preview of MAI-1 at Microsoft’s Build developer conference later this month.

Microsoft is OpenAI’s biggest investor so the fact that it’s developing its own LLMs to rival those of OpenAI is a little surprising to some. Is Microsoft hedging its bets, pursuing multiple development strategies, or something else entirely?

Microsoft’s CTO Kevin Scott tried to downplay the issue. In a post on LinkedIn Scott said, “I’m not sure why this is news, but just to summarize the obvious: we build big supercomputers to train AI models; our partner Open AI uses these supercomputers to train frontier-defining models; and then we both make these models available in products and services so that lots of people can benefit from them. We rather like this arrangement.”

Scott may be sincere in this statement, but when MAI-1 is released it could put Microsoft squarely in competition with the company in which it has invested billions of dollars.

Will MAI-1 be released just in time for OpenAI to upstage it with GPT-5? OpenAI had scheduled an event for this Thursday, where the company was expected to share updates and product demonstrations, but the event has since been postponed.

With mysterious GPT-2 chatbots appearing, disappearing, and now reappearing, Microsoft building huge models, and OpenAI keeping us guessing, the AI drama is relentless.


Eugene van der Watt

Eugene comes from an electronic engineering background and loves all things tech. When he takes a break from consuming AI news you'll find him at the snooker table.
