Meta releases Ego-Exo4D, a multimodal perception dataset

  • Ego-Exo4D is a dataset of videos of skilled human activity from first and third-person perspectives.
  • The foundational dataset aims to support research on video learning and multimodal perception.
  • The dataset could enable the training of robots or teach humans new skills using augmented reality.

Training AI models like GPT-4 has relied mostly on datasets consisting of text and images. Meta’s Ego-Exo4D multimodal perception dataset presents data scientists with a rich new set of training data.

You can learn a new skill by reading a book, but it’s so much easier when someone shows you how to do something while explaining it to you. This is the goal Meta’s FAIR (Fundamental Artificial Intelligence Research) team has for Ego-Exo4D.

The dataset consists of first-person (Ego) and third-person (Exo) perspective videos of people performing different skilled human activities. These could be anything from cooking, dancing, playing music, or repairing a bicycle. The data was collected across 13 cities worldwide by 839 camera wearers, capturing 1422 hours of video.

The videos, which are filmed simultaneously, are then augmented with additional modes of data courtesy of Meta’s Project Aria glasses.

Project Aria glasses are wearable computers in glasses form. They capture the wearer’s video and audio, as well as their eye tracking and location information. The glasses also sense head poses and 3D point clouds of the environment.

The result is a dataset of simultaneous videos of a task being performed, with first-person narrations by the camera wearers describing their actions, and head and eye tracking of the person performing the task.

Meta then added third-person play-by-play descriptions of every camera wearer’s actions. Meta also hired experts in multiple fields to add third-person spoken expert commentary critiquing the way the person in the video performed the task.

By collecting both egocentric and exocentric views, the Ego-Exo4D dataset can show researchers what activities look like from different perspectives. This could help them eventually develop computer vision algorithms that can recognize what a person is doing from any perspective.

Ego-Exo4D opens new learning opportunities

One of the key obstacles to achieving AGI or training robots more efficiently is the lack of sensory perception that computers have. As humans, we have so many sensory inputs from our environment that we often take for granted as we learn new skills.

Ego-Exo4D will be an extremely useful resource to help bridge this gap.

Dr. Gedas Bertasius, Assistant Professor in the Department of Computer Science at the University of North Carolina said, “Ego-Exo4D isn’t just about collecting data, it’s about changing how AI understands, perceives, and learns. With human-centric learning and perspective, AI can become more helpful in our daily lives, assisting us in ways we’ve only imagined.”

Ego-Exo4D training data snapshot from bike repair example. Source: Meta

Meta says it hopes Ego-Exo4D will “enable robots of the future that gain insight about complex dexterous manipulations by watching skilled human experts in action.”

This dataset combined with Project Aria glasses will soon also enable a truly immersive learning experience for humans. Imagine performing a task while your glasses use augmented reality (AR) to overlay a tutorial video or talk you through your task.

You could be learning to play the piano and have a visual overlay showing you where your hands should move with real-time audio advice as you do it. Or you could pop the hood of your car and be guided in troubleshooting and fixing an engine problem.

It will be interesting to see if Meta’s Ego How-To learning concept will drive better adoption of Project Aria glasses than the failed Google Glass product experienced. There’s no word on when they’ll be available to purchase yet though.

Meta will make the Ego-Exo4D dataset available for download before the end of December.

© 2023 Intelliquence Ltd. All Rights Reserved.

Privacy Policy | Terms and Conditions

×
 
 

FREE PDF EXCLUSIVE
Stay Ahead with DailyAI


 

Sign up for our weekly newsletter and receive exclusive access to DailyAI's Latest eBook: 'Mastering AI Tools: Your 2024 Guide to Enhanced Productivity'.



 
 

*By subscribing to our newsletter you accept our Privacy Policy and our Terms and Conditions