FastVideo
Overview:
FastVideo is an open-source framework designed to accelerate large video diffusion models. It offers two consistency-distilled video diffusion models, FastHunyuan and FastMochi, which achieve an 8x increase in inference speed. FastVideo introduces the first open video DiT distillation recipe based on PCM (Phased Consistency Model), supporting distillation, fine-tuning, and inference for state-of-the-art open video DiT models, including Mochi and Hunyuan. FastVideo also supports scalable training with FSDP, sequence parallelism, and selective activation checkpointing, as well as memory-efficient fine-tuning with LoRA, pre-computed latents, and pre-computed text embeddings. Ongoing development is highly experimental, with plans to introduce more distillation methods, support additional models, and update the code.
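The consistency-distillation idea behind FastHunyuan and FastMochi can be illustrated with a toy PyTorch training step. This is a hypothetical sketch on 1-D tensors, not FastVideo's actual PCM recipe (which operates on video DiT latents); the variable names and noise schedule here are invented for illustration:

```python
import torch
import torch.nn as nn

# Toy models standing in for the student and its frozen EMA teacher.
student = nn.Linear(8, 8)
teacher = nn.Linear(8, 8)
teacher.load_state_dict(student.state_dict())
for p in teacher.parameters():
    p.requires_grad = False  # teacher is never trained directly

opt = torch.optim.Adam(student.parameters(), lr=1e-3)

x_t = torch.randn(16, 8)               # noisy latent at timestep t (toy data)
x_s = x_t + 0.1 * torch.randn(16, 8)   # latent at an adjacent timestep s

# Consistency loss: the student's prediction at t should match the
# detached teacher prediction at the neighboring timestep s, so the
# distilled model maps any point on the trajectory to the same output.
loss = nn.functional.mse_loss(student(x_t), teacher(x_s).detach())
loss.backward()
opt.step()
```

After many such steps (with the teacher updated as an EMA of the student), the student can generate in far fewer denoising steps, which is the source of the claimed speedup.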
Target Users:
The target audience includes researchers and developers in the video processing field, particularly those who work with large video diffusion models and want to improve inference speed and efficiency. By providing efficient video diffusion models and distillation techniques, FastVideo helps users run high-performance video generation tasks even with limited resources.
Total Visits: 474.6M
Top Region: US (19.34%)
Website Views: 52.7K
Use Cases
Researchers use the FastVideo framework to distill the Hunyuan model to enhance video generation speed and efficiency.
Developers utilize the FastMochi model provided by FastVideo for rapid video content generation and processing.
Educational institutions deploy the FastVideo framework for teaching and research on video diffusion models, improving students' learning effectiveness and experimental outcomes.
Features
- Supports the FastHunyuan and FastMochi video diffusion models, achieving an 8x increase in inference speed.
- Provides PCM-based video DiT distillation recipes.
- Supports distillation, fine-tuning, and inference for state-of-the-art video DiT models such as Mochi and Hunyuan.
- Allows scalable training with FSDP, sequence parallelism, and selective activation checkpointing.
- Facilitates memory-efficient fine-tuning using LoRA, pre-computed latents, and pre-computed text embeddings.
- Offers downloads of pre-processed data and pretrained model weights to simplify setup.
- Provides optional scripts for adversarial loss, though no significant performance improvements have been observed with it.
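The memory savings of LoRA fine-tuning come from freezing the base weights and training only two small low-rank matrices per layer. The wrapper below is a minimal, self-contained PyTorch sketch of that idea (a hypothetical `LoRALinear` class written for illustration, not FastVideo's actual implementation):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # base weights stay frozen
        # Low-rank factors: A projects down to `rank`, B projects back up.
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # Base output plus the scaled low-rank update (B @ A) x.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(64, 64), rank=4)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
```

With rank 4 on a 64x64 layer, only 512 of the 4,672 parameters are trainable, which is why optimizer state and gradient memory shrink so dramatically during fine-tuning.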
How to Use
1. Install FastVideo: Follow the instructions on the GitHub page and run `./env_setup.sh fastvideo` to set up the environment.
2. Download model weights: Use the provided scripts to download the weights for FastHunyuan or FastMochi models.
3. Run inference: Depending on the model, execute the corresponding inference script, for example, `sh scripts/inference/inference_hunyuan.sh` for FastHunyuan model inference.
4. Distill models: Follow the documentation to download the original model weights and use `bash scripts/distill/distill_mochi.sh` or `bash scripts/distill/distill_hunyuan.sh` to perform model distillation.
5. Fine-tune models: Ensure that the data is prepared and pre-processed, then use `bash scripts/finetune/finetune_mochi.sh` for fine-tuning the Mochi model.
6. Check development plans and updates: Regularly visit the FastVideo GitHub page for the latest development plans and code updates.
AIbase
© 2025 AIbase