Google, OpenAI, and Mistral have released new versions of their cutting-edge AI models within just 12 hours.
Meta is also set to join the party with its upcoming Llama 3 model, and OpenAI’s much-anticipated GPT-5 is in the pipeline.
What started as a niche category dominated by ChatGPT is now inundated with alternatives that span Big Tech and smaller labs alike, on both sides of the open- and closed-source divide.
Google Gemini 1.5 Pro
Google’s Gemini 1.5 Pro made the first splash, introducing advances in long-context understanding that challenge Claude 3 Opus, which currently holds the aces in that category.
Our next-generation AI model Gemini 1.5 Pro is now available in public preview on @GoogleCloud’s #VertexAI platform.
Its long-context window is already helping businesses analyze large amounts of data, build AI-powered customer service agents & more. → https://t.co/CLMN3wNmeP pic.twitter.com/RpRVUul3eg
— Google DeepMind (@GoogleDeepMind) April 9, 2024
With the ability to process up to 1 million tokens, Gemini 1.5 Pro can handle vast amounts of information at once, including roughly 700,000 words, an hour of video, or 11 hours of audio.
Its Mixture-of-Experts (MoE) architecture improves efficiency and performance by routing each input to specialized “expert” sub-networks rather than activating the entire model.
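Google hasn’t published Gemini 1.5 Pro’s internals, but the general MoE idea can be shown in a short, purely illustrative sketch: a small router scores the available experts for each token, and only the top-scoring few actually run, so most of the network stays idle on any given input. The class, dimensions, and expert count below are made up for illustration and are not Gemini’s real architecture.

```python
# Illustrative top-k Mixture-of-Experts routing (not Gemini's actual code).
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, dim=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)  # scores each expert per token
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
             for _ in range(n_experts)]
        )
        self.top_k = top_k

    def forward(self, x):                                   # x: (tokens, dim)
        weights = self.router(x).softmax(dim=-1)            # routing probabilities
        top_w, top_idx = weights.topk(self.top_k, dim=-1)   # keep only the best k experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e                 # tokens routed to expert e
                if mask.any():
                    out[mask] += top_w[mask, slot:slot + 1] * expert(x[mask])
        return out                                           # only k of n_experts ran per token

tokens = torch.randn(16, 64)
print(TinyMoE()(tokens).shape)  # torch.Size([16, 64])
```

The efficiency win is that compute per token scales with the k experts actually selected, not with the total parameter count.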
Google’s roster of Gemini models is quite complex, but this rates as its most capable model for typical tasks.
Google is also letting developers make 50 free requests to the API per day, an allowance one person on X estimated would otherwise cost up to $1,400.
Currently, Gemini 1.5 Pro is available in 180 countries.
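For developers who want to try that free tier, a minimal request with Google’s google-generativeai Python SDK looks roughly like the sketch below; the model name string is an assumption based on Google’s naming at the time, and the prompt is a placeholder.

```python
# Minimal sketch of calling Gemini 1.5 Pro through Google's google-generativeai SDK.
# The model name string is an assumption; check Google AI Studio for current names.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # free-tier keys come from Google AI Studio

model = genai.GenerativeModel("gemini-1.5-pro-latest")  # assumed model name
response = model.generate_content(
    "Summarize the key plot points of this transcript: ..."  # long-context input goes here
)
print(response.text)
```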
New version of GPT-4 Turbo
OpenAI then released a new version of GPT-4 Turbo with improved math and vision processing.
In an X post, OpenAI Developers announced that GPT-4 Turbo with Vision is now generally available in the API, and that vision requests can now also use JSON mode and function calling (sketched below).
GPT-4 Turbo with Vision is now generally available in the API. Vision requests can now also use JSON mode and function calling. https://t.co/cbvJjij3uL
Below are some great ways developers are building with vision. Drop yours in a reply 🧵
— OpenAI Developers (@OpenAIDevs) April 9, 2024
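In practice, that combination looks roughly like the sketch below, written against OpenAI’s Python SDK: an image URL is passed alongside text, and response_format asks for JSON output. The model name, image URL, and prompt are illustrative placeholders.

```python
# Rough sketch of a GPT-4 Turbo vision request that also asks for JSON output,
# using OpenAI's Python SDK (v1.x). Model name, URL, and prompt are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-turbo",
    response_format={"type": "json_object"},  # JSON mode
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Return a JSON object describing the items in this receipt."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/receipt.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)  # a JSON string
```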
OpenAI anticipates releasing GPT-5 soon, as well as its text-to-video model Sora, which has no notable competitors right now (though that will change).
Mixtral 8x22B
However, perhaps the biggest surprise came from Mistral, which boldly published its Mixtral 8x22B model as a freely downloadable 281GB file via torrent.
magnet:?xt=urn:btih:9238b09245d0d8cd915be09927769d5f7584c1c9&dn=mixtral-8x22b&tr=udp%3A%2F%https://t.co/2UepcMGLGd%3A1337%2Fannounce&tr=http%3A%2F%https://t.co/OdtBUsbeV5%3A1337%2Fannounce
— Mistral AI (@MistralAI) April 10, 2024
With an estimated 176 billion parameters and a context length of 65,000 tokens, this open-source model, released under the Apache 2.0 license, is expected to outperform Mistral’s previous Mixtral 8x7B, which had already surpassed competitors like Llama 2 70B on various benchmarks.
Mixtral 8x22B’s advanced MoE architecture enables efficient computation and improved performance versus previous iterations.
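For anyone who would rather skip the torrent, loading the weights with the Hugging Face transformers library looks roughly like the sketch below. The repo id is an assumption for illustration (check Mistral’s official Hugging Face page for the real one), and the full model needs hundreds of gigabytes of memory or aggressive quantization.

```python
# Hedged sketch: loading Mixtral 8x22B base weights with Hugging Face transformers.
# The repo id below is an assumption; verify it against Mistral's Hugging Face page.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x22B-v0.1"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" shards the model across available GPUs/CPU as best it can.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

inputs = tokenizer("Mixtral 8x22B is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```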
Meta Llama 3 is incoming
Not to be left behind, Meta could reportedly release a small version of its highly anticipated Llama 3 model as early as next week, with the full open-source model still slated for July.
Llama 3 is expected to come in various sizes, from very small models competing with Claude 3 Haiku or Gemini Nano to larger, fully responsive, and reasoning-capable models rivaling GPT-4 or Claude 3 Opus.
Model multiplication
A generative AI ecosystem once dominated by ChatGPT is now flooded by alternatives.
Virtually every major tech company is involved, either directly or through sizeable investments. And with each player joining the fray, any hope for one faction to dominate the market is dwindling.
We’re now also seeing the gap close between closed-source models from OpenAI, Anthropic, and Google, and open-source alternatives from Mistral, Meta, and others.
Open-source models are still quite inaccessible to the wider population, but this, too, is likely to change.
So, do any of these models represent genuine progress in machine learning, or just more of the same but better? It depends on who you ask.
Some, like Elon Musk, predict AI will exceed human intelligence within a year.
Others, like Meta chief scientist Yann LeCun, argue that AI is miles behind us by any robust measure of intelligence.
Speaking about current LLMs in February, LeCun explained, “So basically, they can’t invent new things. They’re going to regurgitate approximately whatever they were trained on from public data, which means you can get it from Google. People have been saying, ‘Oh my God, we need to regulate LLMs because they’re gonna be so dangerous.’ That is just not true.”
Meta aims to create ‘objective-driven’ AI that genuinely understands the world and can plan and reason about it.
“We are hard at work in figuring out how to get these models not just to talk but actually to reason, to plan . . . to have memory,” explained Joelle Pineau, the vice president of AI research at Meta.
OpenAI’s chief operating officer, Brad Lightcap, also said his company is focusing on improving the AI’s ability to reason and handle more complex tasks.
“We’re going to start to see AI that can take on more complex tasks in a more sophisticated way,” he said at a recent event. “I think over time… we’ll see the models go towards longer, kind of more complex tasks, and that implicitly requires the improvement in their ability to reason.”
As 2024 heads towards summer, the AI community and society at large will be watching closely to see what groundbreaking developments emerge from the laboratories of these tech giants.
It’s going to be quite a colorful selection by the end of the year.