GAIA
Overview :
GAIA (Generative AI for Avatar) synthesizes natural conversational videos from a voice track and a single portrait image, eliminating the hand-crafted domain priors common in conversational avatar generation. The model works in two stages: 1) a variational autoencoder (VAE) decomposes each video frame into disentangled motion and appearance representations; 2) a diffusion model generates a motion sequence conditioned on the voice sequence and a reference portrait image. During training, the diffusion model is optimized to predict motion sequences conditioned on a voice sequence and randomly sampled frames from the same video clip. A large-scale, high-quality conversational avatar dataset was collected, and models were trained at several scales; experimental results validate GAIA's superiority, scalability, and flexibility. GAIA supports applications such as controllable conversational avatar generation and text-guided avatar generation.
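The two-stage pipeline above can be sketched in miniature with NumPy. Everything here is an illustrative assumption, not GAIA's actual architecture: the dimensions, the linear "VAE-like" encoder/decoder, and the fixed-target denoising step are toy stand-ins that only mirror the data flow (frame → motion code; noise + voice + reference → motion sequence → frames):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions; the real model sizes are not stated here).
MOTION_DIM = 8   # per-frame motion latent
AUDIO_DIM = 4    # per-frame voice feature
T = 16           # frames in a clip

# Stage 1 (sketch): a linear encoder/decoder standing in for the VAE that
# splits a frame into a motion code; appearance comes from the portrait.
W_enc = rng.normal(size=(MOTION_DIM, MOTION_DIM))
W_dec = np.linalg.pinv(W_enc)

def encode_motion(frame_feat):
    return W_enc @ frame_feat

def decode_frame(motion_code, appearance):
    # "Render" = appearance combined with motion; an additive stand-in here.
    return W_dec @ motion_code + appearance

# Stage 2 (sketch): iterative denoising of a random motion sequence toward a
# target conditioned on the voice features and a reference motion code.
W_audio = rng.normal(size=(MOTION_DIM, AUDIO_DIM)) * 0.1

def denoise_step(x, audio, ref, alpha=0.5):
    target = W_audio @ audio + 0.5 * ref
    return (1 - alpha) * x + alpha * target

def generate_motion(audio_seq, ref_motion, steps=10):
    x = rng.normal(size=(T, MOTION_DIM))
    for _ in range(steps):
        x = np.stack([denoise_step(x[t], audio_seq[t], ref_motion)
                      for t in range(T)])
    return x

audio_seq = rng.normal(size=(T, AUDIO_DIM))
ref_motion = encode_motion(rng.normal(size=MOTION_DIM))
motion = generate_motion(audio_seq, ref_motion)
appearance = rng.normal(size=MOTION_DIM)
frames = np.stack([decode_frame(m, appearance) for m in motion])
print(frames.shape)  # one feature vector per frame: (16, 8)
```

The real system replaces each linear map with a learned network and the fixed-point update with a trained diffusion sampler, but the conditioning structure (voice sequence plus a reference frame driving per-frame motion codes) is the same.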
Target Users :
Researchers and developers working on AI/ML technologies who need to generate natural conversational video avatars.
Total Visits: 934.0K
Top Region: US(19.93%)
Website Views : 69.0K
Use Cases
Voice-Driven Conversational Avatar Generation
Video-Driven Conversational Avatar Generation
Text-Guided Avatar Generation
Features
Voice-Driven Conversational Avatar Generation
Video-Driven Conversational Avatar Generation
Pose-Controllable Conversational Avatar Generation
Fully Controllable Conversational Avatar Generation
Text-Guided Avatar Generation
AIbase
Empowering the Future, Your AI Solution Knowledge Base
© 2025 AIbase