Runway unveiled its latest text-to-video (T2V) generator, called Gen 3 Alpha, and the demos hint that this could be the best AI video generator yet.
OpenAI’s Sora wowed us a few months ago, but there’s still no word on when (or if) it will be released. Runway already offers free and paid access to its previous-generation T2V tool, Gen 2.
Gen 2 makes some decent videos, but it’s a little hit-or-miss and often produces weird anatomy or clunky movements when generating people.
Gen 3 Alpha delivers hyperrealistic video with smooth motion and coherent human models.
Runway says, “Gen-3 Alpha excels at generating expressive human characters with a wide range of actions, gestures, and emotions, unlocking new storytelling opportunities.”
Introducing Gen-3 Alpha: Runway’s new base model for video generation.
Gen-3 Alpha can create highly detailed videos with complex scene changes, a wide range of cinematic choices, and detailed art directions. https://t.co/YQNE3eqoWf
(1/10) pic.twitter.com/VjEG2ocLZ8
— Runway (@runwayml) June 17, 2024
The improved fidelity comes with a speed upgrade too, with maximum-length 10-second clips generated in just 90 seconds. The 10-second clip limit falls short of the clips of up to a minute that OpenAI has shown from Sora, but it is twice that of Luma and three times that of Runway’s Gen 2.
Besides the improved human representations, the accurate physics of the videos is truly impressive.
And to think that this video is 100% generated by AI, it’s total madness the news we have about AI videos these days. #Runway Gen-3 🔥🔥 pic.twitter.com/FLC5TGfYzr
— Pierrick Chevallier | IA (@CharaspowerAI) June 17, 2024
Runway says Gen 3 Alpha will power improved control modes that let a user select specific elements to animate and direct detailed camera movements, with “upcoming tools for more fine-grained control over structure, style, and motion.”
The degree of camera control gives you an idea of how close we are to the end of traditional movie production.
Prompt: Handheld camera moving fast, flashlight light, in a white old wall in a old alley at night a black graffiti that spells ‘Runway’.
(10/10) pic.twitter.com/xRreX33g0r
— Runway (@runwayml) June 17, 2024
OpenAI previously hinted that alignment concerns are one of the reasons it hasn’t released Sora yet. Runway says Gen 3 Alpha comes with a new set of safeguards and support for C2PA, a provenance standard that allows the origin of generated video to be tracked.
General world models
The idea of turning text into videos will appeal to most users, but Runway says Gen 3 Alpha represents a step towards a different goal.
Runway says, “We believe the next major advancement in AI will come from systems that understand the visual world and its dynamics, which is why we’re starting a new long-term research effort around what we call general world models.”
Training an embodied AI to navigate and interact with an environment is a lot faster and cheaper in simulation than in the real world. For the simulation to be useful, it needs to accurately represent the physics and motion of real-world environments.
Runway says these general world models “need to capture not just the dynamics of the world, but the dynamics of its inhabitants, which involves also building realistic models of human behavior.”
The coherent motion, physics, human features, and emotions in the Gen 3 demo videos are evidence of a big step towards making this possible.
OpenAI has almost certainly been working on an upgraded Sora, but with Runway’s Gen 3 Alpha, the race for best AI video generator just got a lot more competitive.
There’s no word on when Gen 3 Alpha will be released, but more demos are available to watch, and you can experiment with Gen 2 for now.