Salesforce, an enterprise software company, has unveiled two compact AI models that challenge the “bigger is better” paradigm in AI.
Despite their compact size, the 1- and 7-billion parameter xLAM models outperform many larger models in function-calling tasks.
In a function-calling task, an AI system interprets a natural language request and translates it into specific function calls or API requests.
For example, if you ask an AI system to “find flights to New York for next weekend under $500,” the model needs to understand this request, identify the relevant functions (e.g., search_flights, filter_by_price), and execute them with the correct parameters.
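The flow above can be sketched in code. This is a hypothetical illustration, not Salesforce's implementation: the function names `search_flights` and `filter_by_price` come from the article's example, while the JSON schema, stub functions, and dispatch logic are assumptions about how a host application might execute a model's structured output.

```python
import json

def search_flights(destination, dates):
    # Stub: a real implementation would query a flight API.
    return [{"destination": destination, "dates": dates, "price": 450},
            {"destination": destination, "dates": dates, "price": 620}]

def filter_by_price(flights, max_price):
    return [f for f in flights if f["price"] <= max_price]

# What a function-calling model might emit for the request
# "find flights to New York for next weekend under $500":
model_output = json.dumps([
    {"name": "search_flights",
     "arguments": {"destination": "New York", "dates": "next weekend"}},
    {"name": "filter_by_price", "arguments": {"max_price": 500}},
])

# The host application parses the calls and executes them in order,
# feeding each result into the next step.
registry = {"search_flights": search_flights,
            "filter_by_price": filter_by_price}
result = None
for call in json.loads(model_output):
    args = call["arguments"]
    if call["name"] == "filter_by_price":
        result = registry[call["name"]](result, **args)
    else:
        result = registry[call["name"]](**args)

print(result)  # only the flight under $500 remains
```

The key point is that the model never runs code itself: it produces a structured plan (function names plus parameters), and the application is responsible for validating and executing it.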
“We demonstrate that models trained with our curated datasets, even with only 7B parameters, can achieve state-of-the-art performance on the Berkeley Function-Calling Benchmark, outperforming multiple GPT-4 models,” the researchers write in their paper.
“Moreover, our 1B model achieves exceptional performance, surpassing GPT-3.5-Turbo and Claude-3 Haiku.”
The Berkeley Function-Calling Benchmark, referenced in the study, is an evaluation framework designed to assess the function-calling capabilities of AI models.
Key stats from the study include:
- The xLAM-7B model (7 billion parameters) ranked 6th on the Berkeley Function-Calling Leaderboard, outperforming GPT-4 and Gemini-1.5-Pro.
- The smaller xLAM-1B model outperformed larger models like Claude-3 Haiku and GPT-3.5-Turbo, demonstrating exceptional efficiency.
What makes this achievement particularly impressive is the models' size compared to their competitors:
- xLAM-1B: 1 billion parameters
- xLAM-7B: 7 billion parameters
- GPT-3: 175 billion parameters
- GPT-4: Estimated 1.7 trillion parameters
- Claude-3 Opus: Undisclosed, but likely hundreds of billions
- Gemini Ultra: Undisclosed, estimated similar to GPT-4
This shows that efficient design and high-quality training data can be more important than sheer size.
Meet Salesforce Einstein “Tiny Giant.” Our 1B parameter model xLAM-1B is now the best micro model for function calling, outperforming models 7x its size, including GPT-3.5 & Claude. On-device agentic AI is here. Congrats Salesforce Research!
Paper: https://t.co/SrntYvgxR5… pic.twitter.com/pPgIzk82xT
— Marc Benioff (@Benioff) July 3, 2024
To train the model specifically for function-calling, the Salesforce team developed APIGen, a pipeline for creating diverse, high-quality datasets for function-calling tasks.
APIGen works by sampling from a vast library of 3,673 executable APIs across 21 categories, creating realistic scenarios for the AI to learn from.
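As an illustrative sketch only (not Salesforce's actual pipeline), a data-generation step like this pairs a sampled API definition with a natural-language query and a ground-truth call. All field names, the category structure, and the `make_record` helper here are assumptions about what such a training record might look like.

```python
import json
import random

# Toy stand-in for APIGen's library of executable APIs,
# organized by category (the paper describes 3,673 APIs in 21 categories).
api_library = {
    "travel": [
        {"name": "search_flights",
         "parameters": {"destination": "string", "max_price": "number"}},
    ],
}

def make_record(category, query, call):
    """Pair a natural-language query with its ground-truth function call."""
    api = random.choice(api_library[category])
    return {"query": query, "tools": [api], "answer": call}

record = make_record(
    "travel",
    "Find flights to New York under $500",
    [{"name": "search_flights",
      "arguments": {"destination": "New York", "max_price": 500}}],
)
print(json.dumps(record, indent=2))
```

Because the sampled APIs are executable, generated answers can in principle be run and checked, which is what lets a pipeline like this filter for high-quality examples rather than relying on unverified synthetic data.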
Potential applications of xLAM-1B's capabilities include:
- Enhanced customer relationship management (CRM) systems, which Salesforce develops
- More capable digital assistants
- Improved interfaces for smart home devices
- Efficient AI processing for autonomous vehicles
- Real-time language translation on edge devices
By demonstrating that smaller, more efficient models can compete with larger ones, the xLAM models challenge researchers to rethink their approaches to AI architecture and training.
As Salesforce CEO Marc Benioff explained, Tiny Giant highlights the potential for “on-device agentic AI,” perfect for smartphones and IoT devices.
The future of AI will not just involve ever-larger models but smarter, more efficient ones that can bring advanced features to a broader range of devices and applications.