OpenDiT
Overview:
OpenDiT is an open-source project providing a high-performance implementation of the Diffusion Transformer (DiT) based on Colossal-AI. It is designed to improve the training and inference efficiency of DiT applications, including text-to-video and text-to-image generation. OpenDiT achieves its performance gains through the following techniques:

* Up to 80% speedup and 50% memory reduction on GPU;
* Core kernel optimizations including FlashAttention, Fused AdaLN, and Fused LayerNorm;
* Hybrid parallelism methods such as ZeRO, Gemini, and DDP, along with sharding of the EMA model to further reduce memory cost;
* FastSeq: a novel sequence parallelism method particularly suited to workloads like DiT, where activations are large but parameters are small; single-node sequence parallelism can save up to 48% in communication cost, break through the memory limit of a single GPU, and shorten overall training and inference time;
* Significant performance improvements with minimal code modifications;
* No need for users to understand the implementation details of distributed training;
* Complete text-to-image and text-to-video generation workflows;
* Workflows that researchers and engineers can easily use and adapt to real-world applications without modifying the parallelism part;
* Training on ImageNet for text-to-image generation, with released checkpoints.
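To make the "Fused AdaLN" item above concrete, here is the unfused adaptive LayerNorm modulation that DiT blocks compute: a conditioning vector (e.g. timestep plus class embedding) regresses a per-channel scale and shift applied after normalization. This is a minimal illustrative sketch in plain PyTorch; the class name and shapes are ours, not OpenDiT's actual API, which fuses the normalization and modulation into a single GPU kernel for speed.

```python
import torch
import torch.nn as nn

class AdaLNModulation(nn.Module):
    """Unfused reference for the adaptive LayerNorm used in DiT blocks.
    A fused kernel (as in OpenDiT) computes the same math in one pass."""
    def __init__(self, hidden_size: int):
        super().__init__()
        # elementwise_affine=False: scale/shift come from the condition instead
        self.norm = nn.LayerNorm(hidden_size, elementwise_affine=False)
        self.to_scale_shift = nn.Linear(hidden_size, 2 * hidden_size)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, hidden); cond: (batch, hidden)
        scale, shift = self.to_scale_shift(cond).chunk(2, dim=-1)
        return self.norm(x) * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)

block = AdaLNModulation(hidden_size=64)
x = torch.randn(2, 16, 64)    # 2 samples, 16 patch tokens
cond = torch.randn(2, 64)     # timestep/class conditioning
print(block(x, cond).shape)   # torch.Size([2, 16, 64])
```

Fusing these three elementwise/normalization steps matters for DiT precisely because activations dominate memory traffic; a fused kernel avoids materializing the intermediate normalized tensor.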
Target Users:
Researchers and engineers who want to improve the training and inference efficiency of DiT applications, including text-to-video and text-to-image generation.
Total Visits: 474.6M
Top Region: US (19.34%)
Website Views: 130.8K
Features
Fast and efficient DiT training and inference
FlashAttention, Fused AdaLN, and Fused LayerNorm core optimizations
ZeRO, Gemini, and DDP mixed parallelism methods
FastSeq: A novel sequence parallelism method (see the sketch after this list)
Complete text-to-image and text-to-video generation workflows
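As promised above, here is a minimal sketch of the all-gather idea behind sequence parallelism for attention, assuming each rank holds one contiguous chunk of the token sequence. This is illustrative PyTorch, not OpenDiT's FastSeq code: the actual implementation also overlaps the gather communication with computation asynchronously, which is where the reported communication savings come from; that overlap is omitted here.

```python
import torch
import torch.distributed as dist

def sequence_parallel_attention(q, k, v, group=None):
    """Each rank keeps its local queries but all-gathers keys/values,
    so every rank attends over the full sequence while storing only
    its own activation chunk. Shapes: (local_tokens, heads, head_dim).
    Run under torchrun after dist.init_process_group(...)."""
    world = dist.get_world_size(group)
    k_parts = [torch.empty_like(k) for _ in range(world)]
    v_parts = [torch.empty_like(v) for _ in range(world)]
    dist.all_gather(k_parts, k, group=group)
    dist.all_gather(v_parts, v, group=group)
    k_full = torch.cat(k_parts, dim=0)
    v_full = torch.cat(v_parts, dim=0)
    # standard scaled dot-product attention over the gathered sequence
    attn = torch.einsum("qhd,khd->hqk", q, k_full) / q.shape[-1] ** 0.5
    return torch.einsum("hqk,khd->qhd", attn.softmax(dim=-1), v_full)
```

The design fits DiT's profile: activations (the gathered k/v) are large while parameters are small, so gathering along the sequence dimension and sharding activations beats replicating them, provided the communication can be hidden behind compute.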