

MCVD
Overview:
MCVD is a general-purpose model for video generation, prediction, and interpolation. It uses a score-based diffusion loss to generate novel frames: Gaussian noise is injected into the current frames, and the model denoises them while conditioning on past and/or future frames. During training, the past and/or future conditioning frames are randomly masked, so a single model acquires four capabilities: unconditional generation, future prediction, past reconstruction, and interpolation. A 2D convolutional U-Net conditions on past and future frames via concatenation or spatiotemporal adaptive normalization, producing high-quality, diverse video samples. The model can be trained on 1-4 GPUs and scaled up by widening the channels. Despite its simple, non-recursive 2D convolutional architecture, MCVD generates videos of arbitrary length and achieves state-of-the-art results.
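The random masking of conditioning frames described above can be sketched as follows. This is a minimal illustration, not code from the MCVD repository; the function name and masking probabilities are assumptions:

```python
import random

def sample_training_task(p_mask_past=0.5, p_mask_future=0.5):
    """Randomly mask past/future conditioning frames so one model
    learns all four tasks (probabilities here are illustrative)."""
    mask_past = random.random() < p_mask_past
    mask_future = random.random() < p_mask_future
    if mask_past and mask_future:
        return "unconditional generation"   # no conditioning frames
    if mask_past:
        return "past reconstruction"        # only future frames visible
    if mask_future:
        return "future prediction"          # only past frames visible
    return "interpolation"                  # both sides visible
```

Because the mask is resampled every training step, the same denoising network sees all four conditioning regimes and can serve any of them at inference time.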
Target Users:
Researchers and developers working on video generation, prediction, and interpolation
Use Cases
Movie Effect Generation
Video Game Development
Animation Production
Features
Video Generation
Video Prediction
Video Interpolation
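Arbitrary-length generation follows from the prediction capability: frames are sampled a block at a time, with each new block conditioned on the most recently generated frames. A minimal sketch of that blockwise autoregressive loop, with a caller-supplied sampler standing in for the diffusion model (the function names and block/context sizes are assumptions):

```python
def generate_video(total_frames, block_size, context, sample_block):
    """Blockwise autoregressive sampling sketch.

    sample_block(past) is assumed to return `block_size` new frames,
    conditioned on up to `context` previously generated frames
    (an empty `past` means unconditional generation).
    """
    video = sample_block([])              # first block: unconditional
    while len(video) < total_frames:
        past = video[-context:]           # condition on recent frames
        video += sample_block(past)
    return video[:total_frames]
```

Because the network itself is a fixed-size 2D convolutional U-Net, only this outer loop grows with the video length.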