Unique3D is an open-source framework by Tsinghua University that converts a single image into a high-fidelity 3D model using multi-view and normal diffusion models.
What is Unique3D?
Unique3D is an open-source framework developed by Tsinghua University for converting a single image into a 3D model. It combines multi-view diffusion models and normal diffusion models with an efficient multi-level upsampling strategy to quickly generate high-fidelity 3D meshes with rich textures. The framework also integrates the ISOMER mesh reconstruction algorithm to keep geometry and color consistent across views, and it is reported to produce better results than other image-to-3D models such as InstantMesh, CRM, and OpenLRM.
Features of Unique3D
- Single Image 3D Mesh Generation: Unique3D can automatically generate 3D mesh models from a single 2D image, converting flat images into three-dimensional forms with spatial depth.
- Multi-View Image Generation: The system uses multi-view diffusion models to generate four orthogonal view images of the same object, capturing features from different angles to provide comprehensive perspective information for 3D reconstruction.
- Normal Map Generation: Unique3D generates corresponding normal maps for each multi-view image, which record the orientation of the object's surface. These maps are crucial for subsequent 3D model rendering, enhancing the model's realism by simulating how light interacts with the surface.
- Multi-Level Resolution Enhancement: Through a multi-level upsampling process, the resolution of the generated images is gradually increased from low to high (e.g., from 256×256 to 2048×2048), making the textures and details of the 3D model clearer.
- Integration of Geometry and Texture Details: During the reconstruction process, Unique3D tightly integrates color information with geometric shapes, ensuring that the generated 3D model visually matches the original 2D image while having complex geometric structures and rich texture details.
- High-Fidelity Output: The generated 3D models are highly consistent with the input 2D images in terms of shape, texture, and color, achieving high fidelity in both geometric accuracy and texture richness.
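To make the role of normal maps concrete, here is a minimal sketch of how a renderer can turn a normal map into diffuse (Lambertian) shading. It is not taken from the Unique3D code base: the function name `shade_lambertian` is hypothetical, and the `rgb = (n + 1) / 2` encoding is a common convention assumed here rather than something the source specifies.

```python
import numpy as np

def shade_lambertian(normal_map, light_dir):
    """Return per-pixel diffuse intensity in [0, 1] for a normal map.

    normal_map: (H, W, 3) array with values in [0, 1], assumed to encode
                unit normals as rgb = (n + 1) / 2.
    light_dir:  3-vector pointing from the surface toward the light.
    """
    # Decode colors back into normal vectors and re-normalize them.
    normals = np.asarray(normal_map, dtype=np.float64) * 2.0 - 1.0
    lengths = np.linalg.norm(normals, axis=-1, keepdims=True)
    normals = normals / np.clip(lengths, 1e-8, None)

    light = np.asarray(light_dir, dtype=np.float64)
    light = light / np.linalg.norm(light)

    # Lambert's cosine law: brightness is the cosine of the angle between
    # the surface normal and the light direction, clamped at zero.
    return np.clip(normals @ light, 0.0, 1.0)

# A 1x2 "image": one pixel facing the camera (+z), one facing right (+x).
normal_map = np.array([[[0.5, 0.5, 1.0], [1.0, 0.5, 0.5]]])
intensity = shade_lambertian(normal_map, light_dir=[0.0, 0.0, 1.0])
```

With the light pointing along +z, the pixel facing the camera is fully lit and the pixel facing sideways receives no light, which is the kind of orientation-dependent shading a normal map makes possible.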
Technical Principles of Unique3D
- Multi-View Diffusion Models: These models generate multi-view images (typically four orthogonal views) from a single input image. Trained on 2D image data, they learn an image distribution that they then use to synthesize views of the same object from other perspectives.
- Normal Diffusion Models: Working in conjunction with the multi-view diffusion models, these models generate a normal map for each generated view. The normal maps encode surface normal directions, which are crucial for the subsequent 3D reconstruction.
- Multi-Level Upsampling Process: This strategy gradually increases the resolution of the generated images. Initially, the images are generated at a low resolution, and through upsampling techniques, the resolution is progressively increased to achieve clearer details.
- ISOMER Mesh Reconstruction Algorithm: An efficient algorithm that reconstructs 3D meshes from high-resolution multi-view RGB images and normal maps. The ISOMER algorithm includes three steps:
  - Initial Mesh Estimation: Quickly generates a rough topology and initial mesh of the 3D object.
  - Coarse-to-Fine Mesh Optimization: Iteratively refines the mesh so that its shape moves progressively closer to the target shape.
  - Explicit Target Optimization: Assigns an optimization target to each vertex to resolve problems caused by inconsistent views and improve the accuracy of geometric details.
- Color and Geometry Prior Integration: During the mesh reconstruction process, color information and geometric shape information are integrated into the mesh results to enhance the visual realism and accuracy of the final model.
- Explicit Target (ExplicitTarget): Defines an optimization target for each vertex, which is a mapping function from a set of vertices to a set of colors, guiding the optimization of vertex colors to improve multi-view consistency.
- Expansion Regularization: A technique used during optimization to prevent surface collapse by moving vertices in the direction of their normals, ensuring the integrity of the model.
- Color Completion Algorithm: An efficient algorithm for color completion in unseen regions, smoothly propagating colors from visible regions to unseen regions to ensure color consistency across the entire model.
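The idea behind expansion regularization, moving vertices along their normals, can be sketched as follows. This is an illustration under stated assumptions and not the ISOMER implementation: the helper names `vertex_normals` and `expand`, the area-weighted normal estimate, and the fixed `step` size are all hypothetical choices, and in an actual optimizer the offset would be one term among several rather than a standalone update.

```python
import numpy as np

def vertex_normals(vertices, faces):
    """Area-weighted unit normals per vertex for a triangle mesh."""
    tri = vertices[faces]  # (F, 3, 3): the three corners of each face
    # The cross product's length is twice the face area, so summing the
    # unnormalized face normals weights each face by its area.
    face_n = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    normals = np.zeros_like(vertices)
    for corner in range(3):
        np.add.at(normals, faces[:, corner], face_n)
    lengths = np.linalg.norm(normals, axis=1, keepdims=True)
    return normals / np.clip(lengths, 1e-12, None)

def expand(vertices, faces, step):
    """Push every vertex outward along its normal by `step`."""
    return vertices + step * vertex_normals(vertices, faces)

# Unit octahedron with outward-facing (counter-clockwise) triangles.
vertices = np.array([
    [1.0, 0.0, 0.0], [-1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0], [0.0, -1.0, 0.0],
    [0.0, 0.0, 1.0], [0.0, 0.0, -1.0],
])
faces = np.array([
    [0, 2, 4], [2, 1, 4], [1, 3, 4], [3, 0, 4],
    [2, 0, 5], [1, 2, 5], [3, 1, 5], [0, 3, 5],
])
expanded = expand(vertices, faces, step=0.1)
```

On this symmetric shape each vertex normal points straight away from the center, so every vertex ends up slightly farther out. A small outward push of this kind counteracts the tendency of the surface to shrink or fold in on itself during optimization.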
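Color completion for unseen regions can be illustrated with a simple propagation over mesh edges. The sketch below is an assumption about how such propagation might look, not the algorithm used in Unique3D: the function name `complete_colors` and the rule "an unseen vertex takes the mean color of its already-colored neighbors" are hypothetical simplifications.

```python
import numpy as np

def complete_colors(colors, visible, edges):
    """Fill colors of unseen vertices from their colored neighbors.

    colors:  (N, 3) array; rows for unseen vertices are ignored.
    visible: (N,) boolean mask, True where a color was observed.
    edges:   (E, 2) integer array of undirected mesh edges.
    """
    colors = np.array(colors, dtype=np.float64)
    known = np.array(visible, dtype=bool)
    edges = np.asarray(edges)
    # Treat each undirected edge as two directed edges src -> dst.
    src = np.concatenate([edges[:, 0], edges[:, 1]])
    dst = np.concatenate([edges[:, 1], edges[:, 0]])

    while not known.all():
        # Edges that carry a known color into a still-unknown vertex.
        usable = known[src] & ~known[dst]
        if not usable.any():
            break  # remaining vertices are not connected to any color
        sums = np.zeros_like(colors)
        counts = np.zeros(len(colors))
        np.add.at(sums, dst[usable], colors[src[usable]])
        np.add.at(counts, dst[usable], 1)
        newly = counts > 0
        colors[newly] = sums[newly] / counts[newly, None]
        known |= newly
    return colors

# Three vertices in a chain: red and blue ends are visible, the middle is not.
colors = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
visible = np.array([True, False, True])
edges = np.array([[0, 1], [1, 2]])
completed = complete_colors(colors, visible, edges)
```

The unseen middle vertex receives a blend of its two visible neighbors, while observed colors stay untouched. Repeating the pass lets colors spread outward one ring of vertices at a time into larger unseen areas.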