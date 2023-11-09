

A team of researchers from Adobe Research and the Australian National University has developed an artificial intelligence (AI) model that can transform a single 2D image into a high-quality 3D model in just 5 seconds.

The advance, detailed in their research paper "LRM: Large Reconstruction Model for Single Image to 3D," could revolutionize industries such as gaming, animation, industrial design, augmented reality (AR), and virtual reality (VR).

“Imagine if we could instantly create a 3D shape from a single image of an arbitrary object. The widespread applications in industrial design, animation, gaming, and AR/VR have strongly motivated relevant research in search of a general and efficient approach to this long-term goal,” the researchers wrote.

Credit: yiconghong.me/LRM/

Training with massive datasets

Unlike previous methods trained on small datasets in a category-specific fashion, LRM uses a highly scalable Transformer-based neural network architecture with over 500 million parameters. It is trained end-to-end on approximately 1 million 3D objects from the Objaverse and MVImgNet datasets to predict a neural radiance field (NeRF) directly from the input image.
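For readers unfamiliar with neural radiance fields: a NeRF describes a scene as a function that returns a density and a color at any 3D point, and an image is produced by compositing those values along camera rays. The short Python sketch below illustrates that standard compositing step for a single ray. It is not LRM's code; the function name `render_ray` and its inputs are hypothetical, and in a system like LRM the densities and colors would come from the network's prediction rather than hand-written arrays.

```python
import numpy as np

def render_ray(densities, colors, deltas):
    """Composite samples along one ray with the standard NeRF quadrature.

    densities: (N,) non-negative volume density at each sample (hypothetical input)
    colors:    (N, 3) RGB color at each sample
    deltas:    (N,) distance covered by each sample along the ray
    """
    densities = np.asarray(densities, dtype=float)
    colors = np.asarray(colors, dtype=float)
    deltas = np.asarray(deltas, dtype=float)

    # Opacity contributed by each sample segment.
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance: fraction of light that reaches each sample unblocked.
    transmittance = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    # Each sample's contribution to the final pixel color.
    weights = transmittance * alphas
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights

# Toy ray: two samples of empty space, then an effectively opaque red surface.
rgb, weights = render_ray(
    densities=[0.0, 0.0, 1e9],
    colors=[[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]],
    deltas=[0.1, 0.1, 0.1],
)
```

In the toy ray, the empty samples contribute nothing, so the pixel takes the color of the first opaque surface the ray hits.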

“This combination of a high-capacity model and large-scale training data empowers our model to be highly generalizable and produce high-quality 3D reconstructions from various testing inputs including real-world in-the-wild captures and images from generative models,” the paper states.

Credit: arxiv.org

Lead author Yicong Hong said that LRM represents a breakthrough in single-image 3D reconstruction. “To the best of our knowledge, LRM is the first large-scale 3D reconstruction model; it contains over 500 million learnable parameters, and has been trained on approximately one million 3D shapes and video data across different categories,” he said.

Experiments showed that LRM can reconstruct high-fidelity 3D models from real-world images as well as images created by generative AI models such as DALL-E and Stable Diffusion. The system generates detailed geometry and preserves complex textures such as wood grain.

Potential to transform industries

The potential applications of LRM are broad, ranging from practical uses in industry and design to entertainment and gaming. The model could streamline the creation of 3D assets for video games and animation, reducing time and resource expenditure.

In industrial design, the model could accelerate prototyping by creating accurate 3D models from 2D sketches. In AR/VR, LRM could enhance user experiences by generating detailed 3D environments from 2D images in real time.

Additionally, LRM’s ability to work with “in-the-wild” captures opens up possibilities for user-generated content and the democratization of 3D modeling. Users could potentially create high-quality 3D models from photos taken with their smartphones, creating new creative and business opportunities.

Blurry textures are a problem, but the method advances the field

Although the results are promising, the researchers acknowledged that LRM has limitations, such as producing blurry textures for occluded areas. But they said the work shows the promise of large Transformer-based models trained on huge datasets to learn generalized 3D reconstruction capabilities.

The paper concluded, “In the era of large-scale learning, we hope our idea can inspire future research to explore data-driven 3D large reconstruction models that generalize well to arbitrary in-the-wild images.”

You can see more of LRM’s impressive capabilities, along with examples of high-fidelity 3D object meshes built from single images, on the team’s project page.

