Google announced three new models in its Gemini family, making them available as an experimental release to gather feedback from developers.
The release continues Google’s iterative approach rather than jumping straight to Gemini 2.0. The experimental models are improved versions of Gemini 1.5 Pro and Gemini 1.5 Flash, as well as a new, smaller Gemini 1.5 Flash-8B.
Google’s Product Lead, Logan Kilpatrick, said Google releases experimental models “to gather feedback and get our latest updates into the hands of developers. What we learn from experimental launches informs how we release models more widely.”
Google says the upgraded Gemini 1.5 Pro is a significant improvement over the previous version, with better coding capabilities and handling of complex prompts. Gemini 1.5 models are built for extremely long contexts and can recall and reason over fine-grained information from up to at least 10M tokens, though the experimental models are limited to a 1M-token context window.
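For developers who want to try the experimental Pro release against that 1M-token limit, the sketch below shows one way to check a prompt’s size before sending it, using the google-generativeai Python SDK. The model ID "gemini-1.5-pro-exp-0827" and the file name are assumptions for illustration; check Google AI Studio for the current experimental identifiers.

```python
# Minimal sketch, assuming the google-generativeai SDK and the
# experimental model ID "gemini-1.5-pro-exp-0827" (hypothetical here;
# confirm the current name in Google AI Studio).
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

model = genai.GenerativeModel("gemini-1.5-pro-exp-0827")

# Count tokens before sending a long document so the request stays
# within the experimental models' 1M-token context window.
long_document = open("transcript.txt").read()  # hypothetical input file
token_count = model.count_tokens(long_document).total_tokens
print(f"Prompt uses {token_count} of ~1,000,000 available tokens")

if token_count < 1_000_000:
    response = model.generate_content(
        ["Summarize the key decisions in this transcript:", long_document]
    )
    print(response.text)
```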
Gemini 1.5 Flash is the cheaper, low-latency model designed to handle high-volume tasks and long-context summarization of multimodal inputs. Initial testing of the experimental releases saw the improved Pro and Flash models climbing the LMSYS leaderboard.
Chatbot Arena update⚡!
The latest Gemini (Pro/Flash/Flash-9b) results are now live, with over 20K community votes!
Highlights:
– New Gemini-1.5-Flash (0827) makes a huge leap, climbing from #23 to #6 overall!
– New Gemini-1.5-Pro (0827) shows strong gains in coding, math over… https://t.co/6j6EiSyy41 pic.twitter.com/D3XpU0Xiw2
— lmsys.org (@lmsysorg) August 27, 2024
Gemini Flash 8B
When Google released the Gemini 1.5 technical report earlier this month, it showcased some of the Google DeepMind team’s early work on an even smaller 8-billion-parameter variant of the Gemini 1.5 Flash model.
The multimodal Gemini 1.5 Flash-8B experimental model is now available for testing. Benchmark tests show it beating Google’s lightweight Gemma 2-9B model and Meta’s significantly larger Llama 3-70B.
The idea behind Gemini 1.5 Flash-8B is to have an extremely fast and very cheap model that still has multimodal abilities. Google says it “can power intelligent agents deployed at scale, facilitating real-time interactions with a large user base.” Flash-8B is “intended for everything from high volume multimodal use cases to long context summarization tasks.”
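Since Flash-8B is pitched at high-volume multimodal work, a quick way to test it is a single image-plus-text request. The sketch below assumes the google-generativeai Python SDK and an experimental model ID of "gemini-1.5-flash-8b-exp-0827"; both the ID and the sample file are illustrative, not confirmed names.

```python
# Minimal sketch of a multimodal call to the experimental Flash-8B model,
# assuming the google-generativeai SDK. The model ID
# "gemini-1.5-flash-8b-exp-0827" is an assumption based on Google's
# experimental naming scheme and may differ from the current identifier.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

model = genai.GenerativeModel("gemini-1.5-flash-8b-exp-0827")

# Combine an image and a text instruction in one request, the kind of
# high-volume multimodal task Flash-8B is aimed at.
receipt = Image.open("receipt.jpg")  # hypothetical sample image
response = model.generate_content(
    [receipt, "Extract the merchant name, date, and total as JSON."]
)
print(response.text)
```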
Developers looking for a lightweight, cheap, and fast multimodal model with a 1M-token context will likely be more excited by Gemini 1.5 Flash-8B than by the improved Flash and Pro models. Those looking for more advanced models will be left wondering when Google will release Gemini 1.5 Ultra.