

OpenDiT
Overview:
OpenDiT is an open-source project that provides a high-performance implementation of the Diffusion Transformer (DiT), built on Colossal-AI. It is designed to improve the training and inference efficiency of DiT applications, including text-to-video and text-to-image generation. OpenDiT achieves its performance gains through the following techniques:
* Up to 80% speedup and 50% memory reduction on GPU;
* Kernel-level optimizations including FlashAttention, fused AdaLN, and fused LayerNorm;
* Hybrid parallelism methods such as ZeRO, Gemini, and DDP, along with model sharding of the EMA model to further reduce memory costs;
* FastSeq: a novel sequence parallelism method particularly suited to workloads like DiT, where activations are large but parameters are small. Sequence parallelism within a single node can save up to 48% in communication costs and break through the memory limit of a single GPU, reducing overall training and inference time;
* Significant performance improvements with only minimal code modifications (see the sketch after this list);
* Users do not need to understand the implementation details of distributed training;
* Complete text-to-image and text-to-video generation workflows;
* Researchers and engineers can easily use and adapt these workflows to real-world applications without modifying the parallelism code;
* Training on ImageNet for text-to-image generation, with released checkpoints.
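To illustrate the "minimal code modifications" point, here is a minimal sketch of a training loop using Colossal-AI's public Booster API with a ZeRO plugin, since OpenDiT builds on Colossal-AI. The model, data, and hyperparameters below are placeholders, not OpenDiT's actual training script, and OpenDiT's own helpers may differ in detail.

```python
# A minimal sketch, assuming Colossal-AI's Booster API (which OpenDiT builds on).
# The model, data, and loss below are placeholders, not OpenDiT's real training code.
import colossalai
import torch
from colossalai.booster import Booster
from colossalai.booster.plugin import LowLevelZeroPlugin

colossalai.launch_from_torch(config={})            # set up the distributed environment

model = torch.nn.Linear(128, 128).cuda()           # stand-in for a DiT model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Wrapping the model and optimizer with a ZeRO plugin is the main structural change.
booster = Booster(plugin=LowLevelZeroPlugin(stage=2, precision="bf16"))
model, optimizer, *_ = booster.boost(model, optimizer)

for _ in range(10):                                # placeholder training loop
    x = torch.randn(8, 128, device="cuda")
    loss = model(x).pow(2).mean()                  # placeholder loss
    booster.backward(loss, optimizer)              # replaces loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The rest of the loop stays plain PyTorch, which is what allows large gains with only a few changed lines.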
Target Users:
Researchers and engineers who want to improve the training and inference efficiency of DiT applications, including text-to-video and text-to-image generation.
Features
Fast and efficient DiT training and inference
FlashAttention, fused AdaLN, and fused LayerNorm core optimizations
ZeRO, Gemini, and DDP mixed parallelism methods
FastSeq: A novel sequence parallelism method
Complete text-to-image and text-to-video generation workflows
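To make the FastSeq idea more concrete, the sketch below shows the basic data-movement pattern of sequence parallelism in a transformer block: each GPU holds a shard of the sequence, the shards are all-gathered only around attention (which needs the full sequence), and the position-wise MLP runs on the local shard. The function and module names are illustrative, not OpenDiT's actual API, and the real FastSeq additionally overlaps these communications with computation.

```python
# A conceptual sketch of sequence parallelism in the FastSeq style.
# Assumes torch.distributed is already initialized; names are illustrative only.
import torch
import torch.distributed as dist

def gather_sequence(local_x: torch.Tensor, dim: int = 1) -> torch.Tensor:
    """All-gather the sequence shards from every rank along the sequence dim."""
    world_size = dist.get_world_size()
    shards = [torch.empty_like(local_x) for _ in range(world_size)]
    dist.all_gather(shards, local_x)
    return torch.cat(shards, dim=dim)

def split_sequence(full_x: torch.Tensor, dim: int = 1) -> torch.Tensor:
    """Keep only this rank's shard of the full sequence."""
    rank = dist.get_rank()
    world_size = dist.get_world_size()
    return full_x.chunk(world_size, dim=dim)[rank].contiguous()

def block_forward(block, local_x):
    # Attention needs the whole sequence, so gather before it and re-split after.
    full_x = gather_sequence(local_x)
    full_x = full_x + block.attn(block.norm1(full_x))
    local_x = split_sequence(full_x)
    # The MLP is position-wise, so it can run on the local shard only.
    local_x = local_x + block.mlp(block.norm2(local_x))
    return local_x
```

Because DiT activations (long token sequences) dominate memory while parameters stay small, sharding the sequence this way is what lets a single node save communication and exceed the memory limit of one GPU.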