Lumiere
Overview:
Lumiere is a text-to-video diffusion model designed to synthesize videos with realistic, diverse, and coherent motion, addressing a key challenge in video synthesis. We introduce a Space-Time U-Net architecture that generates the entire temporal duration of a video in a single pass through the model. This contrasts with existing video models, which synthesize distant keyframes and then apply temporal super-resolution, an approach that makes global temporal consistency inherently difficult to achieve. By deploying both spatial and, importantly, temporal downsampling and upsampling, and by leveraging a pre-trained text-to-image diffusion model, our model learns to directly generate a full-frame-rate, low-resolution video at multiple space-time scales. We demonstrate state-of-the-art results in text-to-video generation and show that our design readily facilitates a wide range of content creation tasks and video editing applications, including image-to-video, video inpainting, and stylized generation.
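To make the factorized space-time design more concrete, the sketch below shows a toy block that applies a spatial convolution followed by a strided temporal convolution, so the temporal axis is downsampled alongside the spatial axes. This is a minimal illustration in PyTorch under assumed conventions: the (batch, channels, frames, height, width) layout, the class name SpaceTimeBlock, the kernel sizes, and the temporal_stride parameter are all illustrative assumptions, not part of the Lumiere release.

```python
# A minimal sketch, assuming PyTorch; names and shapes are illustrative,
# not taken from the Lumiere codebase.
import torch
import torch.nn as nn


class SpaceTimeBlock(nn.Module):
    """Factorized spatial + temporal convolution with optional temporal downsampling.

    Operates on video tensors shaped (batch, channels, frames, height, width).
    """

    def __init__(self, in_ch: int, out_ch: int, temporal_stride: int = 1):
        super().__init__()
        # Spatial 1x3x3 convolution: mixes information within each frame.
        self.spatial = nn.Conv3d(in_ch, out_ch, kernel_size=(1, 3, 3), padding=(0, 1, 1))
        # Temporal 3x1x1 convolution: mixes information across frames and,
        # with stride > 1, downsamples the temporal axis so deeper levels of
        # the U-Net see the whole clip at coarser time scales.
        self.temporal = nn.Conv3d(
            out_ch, out_ch, kernel_size=(3, 1, 1),
            stride=(temporal_stride, 1, 1), padding=(1, 0, 0),
        )
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.act(self.spatial(x))
        x = self.act(self.temporal(x))
        return x


if __name__ == "__main__":
    # A 16-frame, 64x64 clip with 8 channels; the block halves the frame count.
    video = torch.randn(1, 8, 16, 64, 64)
    block = SpaceTimeBlock(8, 16, temporal_stride=2)
    out = block(video)
    print(out.shape)  # torch.Size([1, 16, 8, 64, 64])
```

In this toy example the 16-frame clip is compressed to 8 frames while spatial resolution is preserved, which is the kind of temporal compression that lets a single network pass cover the clip's full duration.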
Target Users:
Suitable for text-to-video synthesis, image-to-video generation, video inpainting, stylized generation, and other content creation and video editing applications.
Total Visits: 29.7M
Top Region: US (17.94%)
Website Views: 873.8K
Use Cases
Text-to-video synthesis
Image-to-video generation
Video inpainting (repair)
Features
Synthesize videos that exhibit realistic, diverse, and coherent motion
Generate an entire video's temporal duration in a single model pass (see the sampling sketch after this list)
Readily facilitate a variety of content creation tasks and video editing applications
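To illustrate the single-pass feature referenced above, the following sketch runs a plain DDPM-style reverse process over one tensor that holds every frame of the clip, instead of generating sparse keyframes and temporally super-resolving them afterwards. It assumes PyTorch and a generic noise schedule; sample_video, the schedule constants, and the stand-in denoiser are hypothetical placeholders, not Lumiere's actual sampler.

```python
# A minimal sketch, assuming PyTorch and a standard DDPM update rule.
# The point is that the whole (frames x H x W) tensor is denoised together.
import torch


@torch.no_grad()
def sample_video(denoiser, num_frames=16, height=64, width=64, steps=50, device="cpu"):
    """Denoise an entire low-resolution clip in one pass through the schedule."""
    betas = torch.linspace(1e-4, 0.02, steps, device=device)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    # One noise tensor covering every frame of the clip at once.
    x = torch.randn(1, 3, num_frames, height, width, device=device)

    for t in reversed(range(steps)):
        # Predicted noise for all frames jointly.
        eps = denoiser(x, torch.tensor([t], device=device))
        a_t, ab_t = alphas[t], alpha_bars[t]
        # Standard DDPM mean estimate, applied jointly over the temporal axis.
        x = (x - (1 - a_t) / torch.sqrt(1 - ab_t) * eps) / torch.sqrt(a_t)
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
    return x


if __name__ == "__main__":
    def dummy_denoiser(x, t):
        # Stand-in for the space-time U-Net: predicts zero noise everywhere.
        return torch.zeros_like(x)

    clip = sample_video(dummy_denoiser, num_frames=16, steps=10)
    print(clip.shape)  # torch.Size([1, 3, 16, 64, 64])
```

Because every frame shares the same denoising trajectory, temporal coherence is handled inside the model rather than stitched together across separately generated keyframes.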