DeepMind demos SIMA, a generalist AI agent for 3D environments

March 14, 2024


Imagine an AI that doesn’t just understand commands but applies them, like a human would, across an array of simulated 3D environments. 

That’s the aim of DeepMind’s Scalable, Instructable, Multiworld Agent (SIMA).

Unlike traditional AI, which might excel in discrete tasks like strategic games or specific problem-solving, SIMA’s agents are trained to interpret human language instructions and translate them into actions using a keyboard and mouse, mimicking human interaction with a computer.

This means that whether the task is to navigate a digital landscape, solve puzzles, or interact with objects in a game, SIMA aims to understand and execute these commands with the same intuition and adaptability as a person.


At the project’s core is a large, diverse dataset of human gameplay across research environments and commercial video games.

SIMA was trained and tested on a selection of nine video games through collaborations with eight game studios, including well-known titles like No Man’s Sky and Teardown. Each game challenges SIMA with different skills, from basic navigation and resource gathering to more complex activities like crafting and spaceship piloting.

SIMA’s training included four research environments to assess its physical interaction and object manipulation skills.

In terms of architecture, SIMA uses pre-trained vision and video prediction models, fine-tuned on the specific 3D settings of its game portfolio. 

Unlike traditional game-playing AIs, SIMA doesn’t require source code access or custom APIs. It takes only on-screen images and user-provided instructions as input, and it executes tasks through keyboard and mouse actions.
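To make that interface concrete, here is a minimal Python sketch of an agent loop that sees only screen frames and a text instruction and emits only keyboard and mouse actions. It is an illustration under stated assumptions, not DeepMind's code: every name in it (`Observation`, `Action`, `KeywordPolicy`, `run_episode`) is hypothetical, and the toy keyword lookup stands in for SIMA's learned model.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass
class Observation:
    """What the agent sees: raw screen pixels plus a text instruction."""
    frame: List[List[int]]  # placeholder image (rows of pixel values)
    instruction: str        # e.g. "move forward and use"


@dataclass
class Action:
    """What the agent emits: the same controls a human player would use."""
    keys: List[str] = field(default_factory=list)  # keys pressed this step
    mouse_dx: int = 0
    mouse_dy: int = 0
    click: bool = False


class Policy(Protocol):
    """Anything that maps an observation to an action (hypothetical interface)."""
    def act(self, obs: Observation) -> Action: ...


class KeywordPolicy:
    """Toy stand-in for a learned model: maps instruction words to keys."""
    KEYMAP = {"forward": "w", "left": "a", "back": "s", "right": "d"}

    def act(self, obs: Observation) -> Action:
        words = obs.instruction.lower().split()
        keys = [self.KEYMAP[w] for w in words if w in self.KEYMAP]
        return Action(keys=keys, click="use" in words)


def run_episode(policy: Policy, frames, instruction: str) -> List[Action]:
    """Feed each frame and the instruction to the policy; collect its actions."""
    return [policy.act(Observation(frame=f, instruction=instruction)) for f in frames]


if __name__ == "__main__":
    dummy_frames = [[[0]], [[0]]]  # two 1x1 "screens"
    for action in run_episode(KeywordPolicy(), dummy_frames, "move forward and use"):
        print(action)
```

The point of the sketch is that nothing in the loop touches game internals: swapping in a different game changes only the frames that arrive, which is what lets a single agent carry across titles.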

In its evaluation phase, SIMA demonstrated proficiency across 600 basic skills encompassing navigation, object interaction, and menu use. 

What sets SIMA apart is its generality. This AI isn’t being trained to master a single game or solve a particular set of problems.

Instead, DeepMind is teaching it to be adaptable, to understand instructions, and to act on them across different virtual worlds. 

Tim Harley from DeepMind explained, “It’s still very much a research project,” but in the future, “one could imagine one day having agents like SIMA playing alongside you in games with you and with your friends.”


SIMA is mastering the art of understanding and acting upon our instructions by grounding language in perception and action. 

DeepMind has plenty of gaming heritage, stretching back to AlphaGo, which from 2015 onward beat several high-profile players of the famously complex board game Go.

However, SIMA goes deeper than video games, moving closer to the dream of truly intelligent, instructable AI agents that blur the lines between human and machine understanding. 


Sam Jeans

Sam is a science and technology writer who has worked in various AI startups. When he’s not writing, he can be found reading medical journals or digging through boxes of vinyl records.
