

Make Your Anchor
Overview:
Make-Your-Anchor is a diffusion-based framework for generating 2D virtual avatars. From only about one minute of reference footage, it automatically produces anchor-style videos with precise upper-body and hand movements. A structure-guided diffusion model renders 3D mesh conditions into character appearances, and a two-stage training strategy effectively binds motion to a specific appearance. To generate videos of arbitrary length, the frame-wise diffusion model's 2D U-Net is extended to a 3D form, and a simple, effective batch-overlapped temporal denoising module removes the video-length limit at inference time. Finally, an identity-specific face enhancement module improves the visual quality of the facial region in the output video. Experiments show that the system outperforms existing methods in visual quality, temporal consistency, and identity fidelity.
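To make the batch-overlapped temporal denoising idea concrete, here is a minimal sketch: at each denoising step the long latent sequence is split into overlapping windows, each window is denoised by the video model, and frames covered by several windows are averaged so neighbouring batches stay consistent. The function name `denoise_fn`, the window and overlap sizes, and the simple averaging scheme are illustrative assumptions, not the paper's exact implementation.

```python
import torch

def batch_overlapped_denoise_step(latents, denoise_fn, window=16, overlap=4):
    """One denoising step over a long frame sequence via overlapping windows.

    latents: (T, C, H, W) noisy latents for T frames (assumes T >= window).
    denoise_fn: placeholder for one step of the video diffusion model
                applied to a window of latents.
    """
    T = latents.shape[0]
    out = torch.zeros_like(latents)
    # per-frame counts, used to average frames covered by several windows
    count = torch.zeros(T, *([1] * (latents.dim() - 1)), device=latents.device)
    stride = window - overlap
    starts = list(range(0, T - window + 1, stride))
    if starts[-1] != T - window:        # make sure the tail frames are covered
        starts.append(T - window)
    for s in starts:
        out[s:s + window] += denoise_fn(latents[s:s + window])
        count[s:s + window] += 1.0
    return out / count                  # average the overlapped frames
```

Applied at every sampler timestep, the overlapped frames are repeatedly reconciled across neighbouring batches, which is what lets inference run past any fixed clip length.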
Target Users:
People who need 2D virtual avatars with full-body motion, for use cases such as video livestreaming, virtual anchors, and animated characters.
Features
Generates anchor-style videos from about one minute of video footage.
Precisely reproduces upper-body and hand movements.
Structure-guided diffusion model renders 3D mesh conditions into character appearances.
Two-stage training strategy binds motion to a specific appearance.
3D U-Net and batch-overlapped temporal denoising achieve arbitrary-length video generation (denoising sketched above; U-Net inflation sketched after this list).
Identity-specific face enhancement module improves visual quality in the facial region.
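As a rough illustration of how a frame-wise 2D U-Net can be extended to 3D: a common inflation recipe, and only an assumption about Make-Your-Anchor's exact design, is to insert temporal self-attention layers that mix information across frames at each spatial location while leaving the pretrained spatial layers untouched. The `TemporalAttention` class below is a hypothetical, minimal sketch of such a layer.

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Self-attention over the frame axis only. Inserting this after the
    spatial blocks of a 2D U-Net is one common way to inflate a frame-wise
    image model into a video model; the exact placement and hyperparameters
    used by Make-Your-Anchor are assumptions here."""

    def __init__(self, channels, heads=8):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):
        # x: (batch, frames, channels, height, width)
        b, f, c, h, w = x.shape
        # fold the spatial positions into the batch so attention only
        # mixes information across frames at each pixel location
        y = x.permute(0, 3, 4, 1, 2).reshape(b * h * w, f, c)
        y = self.norm(y)
        y, _ = self.attn(y, y, y)
        y = y.reshape(b, h, w, f, c).permute(0, 3, 4, 1, 2)
        return x + y  # residual keeps the pretrained 2D behaviour intact
```

Because the layer is residual, it can start close to the identity and be trained on top of frozen or pretrained 2D weights; for example, `TemporalAttention(320)(torch.randn(2, 8, 320, 16, 16))` returns a tensor of the same shape.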