KeySync: An efficient, leak-free lip-sync technology.

Overview:

KeySync is a leak-free lip-sync framework for high-resolution video. It addresses the temporal inconsistency of earlier lip-sync methods and uses a masking strategy to handle expression leakage and facial occlusions. It reports strong results in lip reconstruction and cross-synchronization, which makes it applicable to practical scenarios such as automatic dubbing.

Target Users:

KeySync suits researchers and developers, particularly those working in automated video production, game development, and film post-production. Its leak-free lip sync can improve video quality and the viewing experience, which also makes it a good fit for creators of high-quality content.

Use Cases

Use KeySync in an automatic dubbing project to synchronize lip movements for animated characters.

Apply KeySync in video games to enhance the realism of character dialogues.

Improve audiovisual synchronization quality in film post-production using KeySync.

Features

Achieve high-quality lip sync to enhance visual effects.

Handle facial occlusions in videos, improving results on real-world footage.

Reduce expression leakage and evaluate it using the LipLeak metric.

Support multiple audio input representations, including Wav and HuBERT.

Provide an interactive online demo for trying the model.

Offer local inference scripts suitable for long video processing.

Allow users to train custom models to meet different needs.

Include evaluation tools like LipScore for quality inspection.

How to Use

Create and activate a Conda environment: run conda create -n KeySync python=3.11, then conda activate KeySync.

Install necessary dependencies: python -m pip install -r requirements.txt --no-deps.

Download the pre-trained models: run git lfs install, then git clone https://huggingface.co/toninio19/keysync pretrained_models.

Prepare the data by placing video files in data/videos/ and audio files in data/audios/.

Run the inference script for lip-sync processing: bash scripts/infer_raw_data.sh --filelist 'data/videos' --file_list_audio 'data/audios' --output_folder 'my_animations'.
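
Before launching inference, it can help to confirm that the input folders from the steps above are populated consistently. The sketch below is a small pre-flight check, not part of KeySync itself. The function name missing_audio is hypothetical, and the check assumes each video in data/videos/ is paired with an audio file of the same base name in data/audios/. That pairing rule is an assumption made for illustration; the inference script may match files differently.

```shell
# Pre-flight check for the KeySync input folders (illustrative only).
# ASSUMPTION: a video and its audio share a base name, e.g.
# data/videos/clip01.mp4 <-> data/audios/clip01.wav.
# missing_audio is a hypothetical helper, not a KeySync command.

# Print the base name of every video that has no same-named audio file.
# Returns 0 when every video has a match, 1 otherwise.
missing_audio() {
  ma_video_dir="$1"
  ma_audio_dir="$2"
  ma_status=0
  for ma_video in "$ma_video_dir"/*; do
    # Skip subdirectories and the unexpanded glob of an empty folder.
    [ -f "$ma_video" ] || continue
    ma_name="${ma_video##*/}"   # strip the directory part
    ma_stem="${ma_name%.*}"     # strip the last extension
    ma_found=0
    for ma_audio in "$ma_audio_dir/$ma_stem".*; do
      if [ -f "$ma_audio" ]; then
        ma_found=1
        break
      fi
    done
    if [ "$ma_found" -eq 0 ]; then
      printf '%s\n' "$ma_stem"
      ma_status=1
    fi
  done
  return "$ma_status"
}
```

Running missing_audio data/videos data/audios from the repository root lists any unpaired videos and returns a non-zero status if there are any, so it can be chained in front of the inference command with &&. The match is by prefix up to the first following dot, so a video named clip would also be satisfied by an audio file named clip.take2.wav.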