ODIN Model: A single model for 2D and 3D perception

Categories: AI Model, AI Image Detection and Recognition
Tags: #computer vision #instance segmentation #3D perception #transformer architecture
Labels: Standard Picks, Open Source

Overview:

ODIN (Omni-Dimensional INstance segmentation) is a transformer-based model that segments and labels both 2D RGB images and 3D point clouds. Its architecture alternates between fusing information within each 2D view and fusing information across views in 3D. It tells 2D and 3D feature operations apart through the positional encodings of the tokens involved: pixel coordinates for 2D patch tokens and 3D coordinates for 3D feature tokens. ODIN achieves state-of-the-art performance on the ScanNet200, Matterport3D, and AI2THOR 3D instance segmentation benchmarks, and competitive performance on ScanNet, S3DIS, and COCO. It outperforms all previous works when the sensed 3D point cloud is used in place of a point cloud sampled from the 3D mesh. Used as the 3D perception engine in an instructable embodied agent architecture, it sets a new state of the art on the TEACh action-from-dialogue benchmark. The code and checkpoints are available on the project website.
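The alternation between 2D and 3D fusion described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not ODIN's implementation: the function names (`fuse`, `alternate_2d_3d`), the distance-based attention bias that stands in for learned positional encodings, and the four-token scene are all hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(feats, positions, groups):
    """One attention-style fusion pass (illustrative assumption).

    Token i attends to every token j with groups[j] == groups[i]. The score is
    the scaled feature dot product minus the squared distance between the two
    positions, a hand-written stand-in for learned positional encodings.
    """
    dim = len(feats[0])
    fused = []
    for i, query in enumerate(feats):
        peers = [j for j in range(len(feats)) if groups[j] == groups[i]]
        scores = [
            sum(a * b for a, b in zip(query, feats[j])) / math.sqrt(dim)
            - sum((p - q) ** 2 for p, q in zip(positions[i], positions[j]))
            for j in peers
        ]
        weights = softmax(scores)
        fused.append([
            sum(w * feats[j][k] for w, j in zip(weights, peers))
            for k in range(dim)
        ])
    return fused

def alternate_2d_3d(feats, pixels, points, views, rounds=2):
    # Alternate a within-view 2D pass with a cross-view 3D pass.
    one_scene = [0] * len(feats)  # a single group: every token sees every other
    for _ in range(rounds):
        feats = fuse(feats, pixels, views)      # 2D: same view, pixel coordinates
        feats = fuse(feats, points, one_scene)  # 3D: all views, 3D coordinates
    return feats

# Toy scene: two views with two tokens each (made-up values).
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
pixels = [(0, 0), (0, 1), (0, 0), (0, 1)]
points = [(0.0, 0.0, 1.0), (0.0, 0.1, 1.0), (0.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
views = [0, 0, 1, 1]
fused = alternate_2d_3d(feats, pixels, points, views)
```

In this toy scene, tokens 0 and 2 come from different views but share a 3D location, so only the cross-view 3D pass lets them exchange information; the within-view 2D pass never mixes them.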

Target Users:

ODIN is aimed at computer vision researchers and at developers of intelligent agents that need 2D or 3D instance segmentation.

Use Cases

Using the ODIN model for 3D instance segmentation

Applying ODIN as the 3D perception engine in an instructable embodied agent architecture

Conducting experiments using ODIN in computer vision research

Features

Segmentation and labeling of 2D RGB images and 3D point clouds

Differentiation of 2D and 3D feature operations through token positional encodings