Google DeepMind worked with 33 academic labs to create an AI training dataset drawn from 22 different robot types.
A typical robot is very good at doing one specific thing. If you want it to do something even slightly different, it has to be trained from scratch. The ultimate goal for robotics is a robot that can handle a general range of actions and learn new skills by itself.
To train an AI model you need a large dataset relevant to the model’s purpose. Language models like GPT-4 are trained on vast amounts of written data. Image generators like DALL-E 3 are trained on large collections of images.
With X-Embodiment, DeepMind has created a dataset of robotic actions collected from 22 different types of robots. It then used that dataset to train new versions of its RT-1 and RT-2 robotics models.
The data for X-Embodiment was derived from “22 robot embodiments, demonstrating more than 500 skills and 150,000 tasks across more than 1 million episodes,” according to DeepMind’s post.
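To make the idea of a cross-embodiment dataset concrete, here is a minimal sketch of how episodes from many robot types might be pooled into one training set. The field names, the Episode structure, and the normalization step are illustrative assumptions, not the actual X-Embodiment schema.

```python
# Hypothetical sketch: pooling episodes from multiple robot embodiments
# into a single mixed training set. Names and fields are assumptions.
from dataclasses import dataclass
from typing import List


@dataclass
class Episode:
    robot_type: str      # e.g. "franka_panda", "sawyer" (hypothetical labels)
    observations: list   # camera frames / proprioception per timestep
    actions: list        # actions in the robot's native control space
    instruction: str     # natural-language description of the task


def to_common_action_space(episode: Episode) -> Episode:
    """Map native actions into a shared representation (placeholder).

    A real pipeline would apply per-robot calibration here so that
    different arms express actions in a comparable format.
    """
    return episode


def build_cross_embodiment_dataset(per_robot_episodes: List[List[Episode]]) -> List[Episode]:
    """Flatten episodes from every robot type into one mixed training pool."""
    pooled: List[Episode] = []
    for episodes in per_robot_episodes:
        pooled.extend(to_common_action_space(ep) for ep in episodes)
    return pooled
```

The point of the sketch is simply that a single model can be trained on the pooled episodes, so experience gathered on one robot type becomes training signal for all of them.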
Introducing RT-X: a generalist AI model to help advance how robots can learn new skills. 🤖
To train it, we partnered with 33 academic labs across the world to build a new dataset with experiences gained from 22 different robot types.
Find out more: https://t.co/k6tE62gQGP
— Google DeepMind (@GoogleDeepMind) October 3, 2023
The earlier test results of the RT-1 and RT-2 models were already impressive, but DeepMind found that the RT-X versions performed significantly better thanks to the broader, more general dataset.
Testing involved comparing a robot controlled by a model trained for a specific task with that same robot controlled by the RT-1-X model. RT-1-X performed on average 50% better than the models designed specifically for tasks like opening a door or routing a cable.
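As a rough illustration of what that comparison means, the snippet below computes the average relative improvement of a generalist policy over task-specific ones. The success rates here are made up for the example; only the ~50% average figure comes from DeepMind’s report.

```python
# Illustrative comparison: per-task success rates for specialist policies
# vs. one generalist (RT-1-X-style) policy. All numbers are hypothetical.
def mean_relative_improvement(specialist: dict, generalist: dict) -> float:
    """Average of (generalist - specialist) / specialist across tasks."""
    gains = [
        (generalist[task] - specialist[task]) / specialist[task]
        for task in specialist
    ]
    return sum(gains) / len(gains)


# Hypothetical fractions of successful trials per task.
specialist_rates = {"open_door": 0.4, "route_cable": 0.5}
generalist_rates = {"open_door": 0.6, "route_cable": 0.75}

print(f"{mean_relative_improvement(specialist_rates, generalist_rates):.0%}")
# Prints 50%, matching the scale of the improvement DeepMind reports.
```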
RT-2, Google’s vision-language-action (VLA) robotics model, allows robots to learn from web, language, and visual data and then perform tasks they were never explicitly trained on. When engineers trained RT-2-X with the X-Embodiment dataset, they found that RT-2-X was three times as successful as RT-2 on emergent skills.
In other words, the robot was learning new skills it didn’t have before, based on abilities that other robots had contributed to the dataset. Skill transfer between different types of robots could be a game-changer for rapid robotics development.
These results are cause for optimism that we’ll soon see robots with more general skills as well as the ability to learn new ones without being specifically trained for them.
DeepMind says this research could be combined with the self-improvement capabilities of RoboCat, its self-improving AI agent for robotics.
The prospect of a robot that keeps improving and learning new skills would be a huge advantage in fields like manufacturing, agriculture, and healthcare. Those same skills could equally be applied in the defense industry, which is perhaps a less appealing, if inevitable, prospect.