Microsoft launches Phi-3 Mini, a tiny but powerful LM

April 24, 2024
  • Microsoft released Phi-3 Mini, the first in its new family of small language models (SLM)
  • Phi-3 Mini consists of only 3.8B parameters but performs better than models twice its size
  • Microsoft says SLMs are ideal solutions for specific functions, lower costs, and offline use

Microsoft launched Phi-3 Mini, a tiny language model that is part of the company’s strategy to develop lightweight, function-specific AI models.

The progression of language models has seen ever larger parameters, training datasets, and context windows. Scaling the size of these models delivered more powerful capabilities but at a cost.

The traditional approach to training an LLM is to have it consume massive amounts of data which requires huge computing resources. Training an LLM like GPT-4, for example, is estimated to have taken around 3 months and to have cost over $21m.

GPT-4 is a great solution for tasks that require complex reasoning but overkill for simpler tasks like content creation or a sales chatbot. It’s like using a Swiss Army knife when all you need is a simple letter opener.

At only 3.8B parameters, Phi-3 Mini is tiny. Still, Microsoft says it is an ideal lightweight, low-cost solution for tasks like summarizing a document, extracting insights from reports, and writing product descriptions or social media posts.

The MMLU benchmark figures show Phi-3 Mini and the yet-to-be-released larger Phi models beating larger models like Mistral 7B and Gemma 7B.

Phi-3 models’ performance on the Massive Multitask Language Understanding (MMLU) benchmark compared to other models of similar size. Source: Microsoft

Microsoft says Phi-3-small (7B parameters) and Phi-3-medium (14B parameters) will be available in the Azure AI Model Catalog “shortly”.

Larger models like GPT-4 are still the gold standard and we can probably expect that GPT-5 will be even bigger.

SLMs like Phi-3 Mini offer some important benefits that larger models don’t. SLMs are cheaper to fine-tune, require less compute, and could run on-device even in situations where no internet access is available.

Deploying an SLM at the edge results in less latency and maximum privacy because there’s no need to send data back and forth to the cloud.

Here’s Sebastien Bubeck, VP of GenAI research at Microsoft AI with a demo of Phi-3 Mini. It’s super fast and impressive for such a small model.

Curated synthetic data

Phi-3 Mini is a result of discarding the idea that huge amounts of data are the only way to train a model.

Sebastien Bubeck, Microsoft vice president of generative AI research asked “Instead of training on just raw web data, why don’t you look for data which is of extremely high quality?”

Microsoft Research machine learning expert Ronen Eldan was reading bedtime stories to his daughter when he wondered if a language model could learn using only words a 4-year-old could understand.

This led to an experiment where they created a dataset starting with 3,000 words. Using only this limited vocabulary they prompted an LLM to create millions of short children’s stories which were compiled into a dataset called TinyStories.

The researchers then used TinyStories to train an extremely small 10M parameter model which was subsequently able to generate “fluent narratives with perfect grammar.”

They continued to iterate and scale this synthetic data generation approach to create more advanced, but carefully curated and filtered synthetic datasets that were eventually used to train Phi-3 Mini.

The result is a tiny model that will be more affordable to run while offering performance comparable to GPT-3.5.

Smaller but more capable models will see companies move away from simply defaulting to large LLMs like GPT-4. We could also soon see solutions where an LLM handles the heavy lifting but delegates simpler tasks to lightweight models.

Join The Future


Clear, concise, comprehensive. Get a grip on AI developments with DailyAI

Eugene van der Watt

Eugene comes from an electronic engineering background and loves all things tech. When he takes a break from consuming AI news you'll find him at the snooker table.


Stay Ahead with DailyAI


Sign up for our weekly newsletter and receive exclusive access to DailyAI's Latest eBook: 'Mastering AI Tools: Your 2024 Guide to Enhanced Productivity'.


*By subscribing to our newsletter you accept our Privacy Policy and our Terms and Conditions