In a collaboration between the Australian National University, the University of Oxford, and the Beijing Academy of AI, researchers have unveiled an AI system named “3D-GPT.”
This chain of cooperating AI agents generates 3D environments from simple text prompts.
The paper, available on arXiv, showcases a streamlined and user-friendly approach to 3D asset creation, in contrast to the convoluted workflows of traditional 3D modeling.
When a user describes an environment such as a “snow-covered peak with bright sunshine in the background,” the system interprets and fleshes out the description, then uses it to generate code that can be passed into 3D computer graphics software like Blender.
3D-GPT breaks down complex 3D modeling tasks into manageable segments, delegating each segment to specialized AI agents.
The agents’ roles are as follows:
- Task dispatch agent: Interprets the text instructions provided by the user.
- Conceptualization agent: Enriches the initial description by filling in any missing details.
- Modeling agent: Sets the necessary parameters and generates code to manipulate 3D software like Blender.
Using this agent-based structure, 3D-GPT can interpret text prompts, augment descriptions with additional context, and create 3D assets that align closely with the user’s imagination.
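To make the division of labor concrete, here is a minimal Python sketch of a three-stage pipeline in the same spirit. It is an illustration under stated assumptions, not the researchers' implementation: the names `SceneSpec`, `dispatch`, `conceptualize`, `model`, and `generate_script` are hypothetical, and simple keyword tables stand in for the language model that each real agent would call. The output is a short Blender Python script returned as text, which would need to be run inside Blender to have any effect.

```python
from dataclasses import dataclass, field


@dataclass
class SceneSpec:
    """Structured description handed from one agent to the next (hypothetical)."""
    prompt: str
    subtasks: list[str] = field(default_factory=list)
    details: dict[str, str] = field(default_factory=dict)


# Assumption: a keyword table stands in for an LLM reading the prompt.
KEYWORD_TO_SUBTASK = {
    "peak": "terrain",
    "mountain": "terrain",
    "snow": "surface",
    "sunshine": "lighting",
    "sun": "lighting",
}

# Assumption: canned defaults stand in for LLM-written scene details.
DEFAULT_DETAILS = {
    "terrain": "a single steep peak about 4 units tall",
    "surface": "plain white snow",
    "lighting": "a bright sun behind the peak",
}


def dispatch(prompt: str) -> SceneSpec:
    """Task dispatch agent (stub): decide which subtasks the prompt calls for."""
    text = prompt.lower()
    subtasks: list[str] = []
    for keyword, subtask in KEYWORD_TO_SUBTASK.items():
        if keyword in text and subtask not in subtasks:
            subtasks.append(subtask)
    return SceneSpec(prompt=prompt, subtasks=subtasks)


def conceptualize(spec: SceneSpec) -> SceneSpec:
    """Conceptualization agent (stub): fill in details the user left out."""
    for subtask in spec.subtasks:
        spec.details.setdefault(subtask, DEFAULT_DETAILS[subtask])
    return spec


def model(spec: SceneSpec) -> str:
    """Modeling agent (stub): turn the enriched spec into Blender Python code."""
    lines = ["import bpy"]
    if "terrain" in spec.details:
        lines.append(f"# terrain: {spec.details['terrain']}")
        lines.append(
            "bpy.ops.mesh.primitive_cone_add("
            "radius1=3.0, depth=4.0, location=(0.0, 0.0, 2.0))"
        )
        if "surface" in spec.details:
            # The cone just added is the active object, so give it a material.
            lines.append(f"# surface: {spec.details['surface']}")
            lines.append('mat = bpy.data.materials.new(name="Snow")')
            lines.append("mat.diffuse_color = (1.0, 1.0, 1.0, 1.0)")
            lines.append("bpy.context.object.data.materials.append(mat)")
    if "lighting" in spec.details:
        lines.append(f"# lighting: {spec.details['lighting']}")
        lines.append(
            "bpy.ops.object.light_add(type='SUN', location=(0.0, 10.0, 8.0))"
        )
    return "\n".join(lines)


def generate_script(prompt: str) -> str:
    """Run the three stubbed agents in sequence."""
    return model(conceptualize(dispatch(prompt)))


if __name__ == "__main__":
    print(generate_script(
        "snow-covered peak with bright sunshine in the background"
    ))
```

The point of the sketch is the hand-off: each stage reads and extends one shared structure, so any stage could be replaced by a real model call without changing the others. A cone with a white material is, of course, a crude stand-in for a mountain.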
Transforming text into 3D worlds
The paper describes how 3D-GPT can take a simple text prompt such as “a misty spring morning, where dew-kissed flowers dot a lush meadow surrounded by budding trees” and breathe life into it, creating a rich 3D scene complete with realistic graphics.
Although the technology has not reached the stage of photorealism, the results are promising.
The researchers are optimistic about the future, stating, “Our empirical investigations confirm that 3D-GPT not only interprets and executes instructions, delivering reliable results, but also collaborates effectively with human designers.”
They believe that their system “highlights the potential of LLMs in 3D modeling, offering a basic framework for future advancements in scene generation and animation.”
As technologies such as the metaverse gain momentum, tools like 3D-GPT could become indispensable.
Potential applications span many industries, including gaming, virtual reality, cinema, and multimedia experiences, making 3D content creation more efficient and accessible.
3D-GPT might ring alarm bells for video game designers and 3D modelers, who are already under pressure from similar tools integrated into popular design platforms like Unity.