

Synthesizing Moving People With 3D Control
Overview
This product is built on a diffusion-model framework and generates motion sequences of a person from a single image, following target 3D poses. Its two core components are learning a prior over the unseen parts of the person's body and clothing, and rendering new body poses with plausible clothing and texture. The prior is trained in texture-map space, which makes it invariant to pose and viewpoint and therefore more efficient to learn. On top of it sits a diffusion-based rendering pipeline controlled by 3D human poses that produces realistic renderings of people. The method generates image sequences that follow the 3D pose targets while remaining visually faithful to the input image, and the explicit 3D control also makes it possible to render the person along synthetic camera trajectories. Experiments show that it can generate image sequences with continuous motion and complex poses, outperforming previous methods.
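The pipeline can be pictured as two stages run in sequence. Below is a minimal, hypothetical sketch of that flow in PyTorch; the class names (TextureInpaintingPrior, PoseConditionedRenderer), the tensor shapes, and the single-convolution stand-ins for the actual diffusion models are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch, assuming a two-stage design: a texture-space in-filling
# prior followed by a 3D-pose-conditioned renderer. Single convolutions
# stand in for the diffusion models; a real system would run an iterative
# diffusion sampler in each stage.
import torch
import torch.nn as nn


class TextureInpaintingPrior(nn.Module):
    """Stage 1 (illustrative): hallucinate unseen regions of a UV texture map.

    Working in texture-map space keeps the prediction independent of the
    pose and viewpoint of the input photograph.
    """

    def __init__(self, channels: int = 3):
        super().__init__()
        self.net = nn.Conv2d(channels + 1, channels, kernel_size=3, padding=1)

    def forward(self, partial_texture, visibility):
        # Concatenate the partial texture with its visibility mask and
        # predict a completed texture map.
        return self.net(torch.cat([partial_texture, visibility], dim=1))


class PoseConditionedRenderer(nn.Module):
    """Stage 2 (illustrative): render the person in a new 3D pose."""

    def __init__(self, channels: int = 3):
        super().__init__()
        self.net = nn.Conv2d(channels * 2, channels, kernel_size=3, padding=1)

    def forward(self, pose_render, full_texture):
        # Condition on an intermediate render of the target 3D pose plus the
        # completed texture, and output a realistic RGB frame.
        return self.net(torch.cat([pose_render, full_texture], dim=1))


def animate(partial_texture, visibility, pose_renders):
    """Run stage 1 once, then stage 2 for every target pose in the motion."""
    stage1 = TextureInpaintingPrior()
    stage2 = PoseConditionedRenderer()
    full_texture = stage1(partial_texture, visibility)
    return [stage2(render, full_texture) for render in pose_renders]


if __name__ == "__main__":
    uv = torch.rand(1, 3, 256, 256)                       # partial UV texture from one photo
    mask = (torch.rand(1, 1, 256, 256) > 0.5).float()     # which texels were visible
    poses = [torch.rand(1, 3, 256, 256) for _ in range(4)]  # per-frame 3D pose renders
    frames = animate(uv * mask, mask, poses)
    print(len(frames), frames[0].shape)
```

In this reading, stage 1 is run once per input photo to complete the texture, and stage 2 is run once per target 3D pose to produce each output frame.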
Target Users
For generating realistic human motion, with applications in film special effects, game development, and virtual reality.
Use Cases
In film special effects production, generate realistic character animations with 3D human motion synthesis.
In game development, create lifelike animations for game characters.
In virtual reality applications, drive realistic performances of virtual characters.
Features
Single-image generation of realistic human motion
Learning priors about the unseen parts of the human body and clothing
Rendering new body poses, including realistic fill-in for clothing, hair, and unseen areas
3D-controlled rendering pipeline, including rendering along synthetic camera trajectories (see the sketch after this list)
Generating image sequences that conform to 3D pose targets
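Because rendering is controlled by explicit 3D poses and cameras, a synthetic camera path can be generated independently of the input photo and supplied to the renderer one extrinsic per frame. The helper below is a generic orbit-trajectory sketch written for illustration; the function names and the simple look-at parameterization are assumptions, not part of the method's tooling.

```python
# Illustrative sketch: build a circular camera trajectory (one 4x4 extrinsic
# per output frame) around a subject, which a 3D-controlled renderer could
# consume frame by frame.
import numpy as np


def look_at(eye, target, up=np.array([0.0, 1.0, 0.0])):
    """Return a 4x4 world-to-camera extrinsic matrix looking from `eye` at `target`."""
    forward = target - eye
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, up)
    right = right / np.linalg.norm(right)
    true_up = np.cross(right, forward)
    rotation = np.stack([right, true_up, -forward])   # camera axes as rows
    extrinsic = np.eye(4)
    extrinsic[:3, :3] = rotation
    extrinsic[:3, 3] = -rotation @ eye                # move the world into the camera frame
    return extrinsic


def orbit_trajectory(center, radius, height, num_frames):
    """Camera extrinsics for one full orbit around `center` at the given radius and height."""
    cameras = []
    for angle in np.linspace(0.0, 2.0 * np.pi, num_frames, endpoint=False):
        eye = center + np.array([radius * np.cos(angle), height, radius * np.sin(angle)])
        cameras.append(look_at(eye, center))
    return cameras


if __name__ == "__main__":
    # Orbit a subject standing at the origin; one extrinsic per rendered frame.
    trajectory = orbit_trajectory(center=np.zeros(3), radius=2.5, height=1.6, num_frames=60)
    print(len(trajectory), trajectory[0].shape)
```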