Audio2photoreal: Transform audio into photo-realistic human avatars

Tags: AI head image generation, AI image generation, Artificial Intelligence, Voice Synthesis, Image Generation, Avatar, Virtual Character, Open Source

Overview:

audio2photoreal is an open-source project that generates photo-realistic avatars from audio. Its PyTorch implementation synthesizes human avatars from the audio of a conversation. The project provides training code, test code, pre-trained motion models, and access to datasets. Its models are a face diffusion model, a body diffusion model, a body VQ-VAE, and a body guide transformer. Researchers and developers can use the project to train their own models and create high-quality, realistic avatars driven by voice.
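
For readers unfamiliar with the "VQ" in the body VQ-VAE: it stands for vector quantization, where each continuous feature vector is replaced by the index of its nearest entry in a codebook, producing a sequence of discrete tokens. The sketch below illustrates only that general idea in plain Python. The codebook values, the feature values, and the function names are made up for illustration and are not taken from the audio2photoreal code, and it makes no claim about how the project connects its models.

```python
# Illustrative only: a toy vector-quantization step, not code from audio2photoreal.
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector: Sequence[float],
             codebook: Sequence[Sequence[float]]) -> Tuple[int, List[float]]:
    """Return (index, entry) for the codebook entry nearest to `vector`."""
    index = min(range(len(codebook)),
                key=lambda i: squared_distance(vector, codebook[i]))
    return index, list(codebook[index])


# Hypothetical 2-D codebook with three entries; a real codebook is learned
# during training and is far larger.
TOY_CODEBOOK = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]

# A short sequence of made-up continuous features becomes discrete tokens.
features = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]]
tokens = [quantize(f, TOY_CODEBOOK)[0] for f in features]
print(tokens)
```

Each feature vector maps to the index of its closest codebook entry. In a trained VQ-VAE the encoder produces the features and the decoder reconstructs the output from the selected entries.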

Target Users:

Voice Character Image Synthesis

3D Avatar Generation

Voice-Driven CG Character

Metaverse Virtual Imagery

Total Visits: 474.6M

Top Region: US (19.34%)

Website Views: 139.4K

Use Cases

Train models with your own collected voice data to generate custom character avatars

Synthesize realistic virtual imagery using voice recordings of historical figures

Use character voiceovers to drive avatars in 3D games and virtual spaces

Features

Generate realistic human avatars from audio

Provide pre-trained motion models and access to datasets