

Masked Diffusion Transformer (MDT)
Overview:
MDT explicitly strengthens the ability of diffusion probabilistic models (DPMs) to learn contextual relationships among object parts in an image by introducing a mask latent modeling scheme. During training, MDT operates in the latent space and masks a subset of tokens. An asymmetric diffusion transformer then predicts the masked tokens from the unmasked ones while the diffusion generation process is maintained. MDTv2 further improves on MDT with a more efficient macro network structure and training strategy.
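The masking step described above can be illustrated with a minimal sketch. It is not the MDT implementation: the function names, the `mask_ratio` default, and the zero-vector placeholder are assumptions made for illustration, and the transformer that would actually predict the masked tokens is left out.

```python
import random

def mask_latent_tokens(tokens, mask_ratio=0.3, seed=None):
    """Randomly split a sequence of latent tokens into visible and masked sets.

    tokens: list of token vectors (each a list of floats).
    mask_ratio: fraction of tokens to hide (an assumed value, not from MDT).
    Returns (visible_tokens, visible_indices, masked_indices).
    """
    rng = random.Random(seed)
    num_tokens = len(tokens)
    num_masked = round(num_tokens * mask_ratio)
    # Pick which positions to hide, keeping the indices in sequence order.
    masked_idx = sorted(rng.sample(range(num_tokens), num_masked))
    masked_set = set(masked_idx)
    visible_idx = [i for i in range(num_tokens) if i not in masked_set]
    visible = [tokens[i] for i in visible_idx]
    return visible, visible_idx, masked_idx

def restore_sequence(visible, visible_idx, masked_idx, mask_token):
    """Rebuild the full-length sequence, placing a placeholder at masked positions.

    A model would replace these placeholders with its predictions.
    """
    num_tokens = len(visible_idx) + len(masked_idx)
    full = [list(mask_token) for _ in range(num_tokens)]
    for token, i in zip(visible, visible_idx):
        full[i] = token
    return full

# Example: 16 latent tokens of dimension 4, with a quarter of them masked.
latents = [[float(i)] * 4 for i in range(16)]
visible, visible_idx, masked_idx = mask_latent_tokens(latents, mask_ratio=0.25, seed=0)
full = restore_sequence(visible, visible_idx, masked_idx, mask_token=[0.0] * 4)
```

Only the visible tokens would be passed to the encoder side of such a model. The restored sequence shows where predictions for the masked positions would be filled in.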
Target Users:
Researchers and developers working on image generation and deep learning who need high-quality image synthesis.
Use Cases
Generate high-resolution images using MDT
Speed up learning in image synthesis tasks
Use MDTv2 to improve (lower) the FID score of image synthesis
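For context on the FID score mentioned above, the standard definition of the Fréchet Inception Distance is given below. This is general background rather than something stated in the MDT description, and the symbols are introduced here: $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ are the mean and covariance of Inception features for real and generated images. Lower values mean the two distributions are closer.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```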
Features
Image Synthesis
Mask Latent Modeling Scheme
Asymmetric Diffusion Transformer
Efficient Macro Network Structure and Training Strategy
Featured AI Tools
Chinese Picks

CapCut Dreamina
CapCut Dreamina is an AIGC tool from Douyin. Users can generate creative images from text prompts, with support for image resizing, aspect ratio adjustment, and template selection. It is planned for use in Douyin's text and short-video content creation to enrich Douyin's AI creation content library.
AI image generation
9.0M

Outfit Anyone
Outfit Anyone is a high-quality virtual try-on product that lets users try different fashion styles without physically putting on the clothes. It uses a two-stream conditional diffusion model to handle clothing deformation flexibly, which produces more realistic results. It is extensible, supporting adjustments to pose and body shape, and works on images ranging from anime characters to real people. Its performance across these scenarios points to practical, real-world use.
AI image generation
5.3M