GRM
Overview:
GRM is a large-scale reconstruction model that recovers a 3D asset from sparse-view images in about 0.1 seconds and completes generation in around 8 seconds. It is a feed-forward Transformer-based model that efficiently fuses multi-view information to translate input pixels into pixel-aligned Gaussians, which are unprojected into a dense set of 3D Gaussians representing the scene. Together, the Transformer architecture and the 3D Gaussian representation unlock a scalable and efficient reconstruction framework. Extensive experiments show that the method surpasses alternatives in both reconstruction quality and efficiency. GRM also shows potential in generation tasks, such as text-to-3D and image-to-3D, when combined with existing multi-view diffusion models.
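To make the "pixel-aligned Gaussians" idea above concrete, here is a minimal sketch of the back-projection step: one Gaussian center per input pixel, placed along that pixel's camera ray at a predicted depth. The function name and the use of a per-pixel depth map are illustrative assumptions, not GRM's actual implementation (the model predicts full Gaussian parameters, such as opacity, scale, rotation, and color, not just centers).

```python
import numpy as np

def unproject_pixels_to_gaussian_centers(depth, K):
    """Back-project a per-pixel depth map into 3D Gaussian centers
    in camera coordinates. This mirrors the pixel-aligned idea:
    one Gaussian per input pixel, placed along that pixel's ray.

    depth : (H, W) predicted depth per pixel (hypothetical network output)
    K     : (3, 3) camera intrinsics matrix
    """
    H, W = depth.shape
    # Pixel-center grid in homogeneous coordinates (u, v, 1)
    u, v = np.meshgrid(np.arange(W) + 0.5, np.arange(H) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)   # (H, W, 3)
    # Ray through each pixel: K^{-1} @ (u, v, 1), scaled by its depth
    rays = pix @ np.linalg.inv(K).T                    # (H, W, 3)
    centers = rays * depth[..., None]                  # (H, W, 3)
    return centers.reshape(-1, 3)                      # one center per pixel
```

With multiple input views, the per-view centers would be transformed into a shared world frame and concatenated, yielding the dense 3D Gaussian collection that represents the scene.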
Target Users:
3D Reconstruction, 3D Modeling, Text-to-3D, Image-to-3D, Visual Effects, Computer Graphics, etc.
Total Visits: 2.0K
Top Region: US (94.46%)
Website Views: 59.3K
Use Cases
Efficiently reconstruct a 3D model of an object from several photos.
Generate a corresponding 3D scene or object directly from a text description.
Reconstruct a 3D model of an object directly from a single 2D image.
Features
High-quality, efficient 3D reconstruction (about 0.1 seconds)
Fast 3D generation (around 8 seconds)
Reconstruction of 3D Gaussians and meshes from various sources (e.g., Zero123++, Instant3D, V3D, SV3D)
Feed-forward Transformer-based model that efficiently fuses multi-view information
Pixel-aligned Gaussians unprojected into a dense set of 3D Gaussians representing the scene
AIbase
Empowering the Future, Your AI Solution Knowledge Base
© 2025 AIbase