seed-tts-eval
Overview
seed-tts-eval is a test set for evaluating a model's zero-shot speech generation capability. It provides an objective evaluation set spanning diverse domains, with samples extracted from English and Mandarin public corpora: 1,000 samples from the Common Voice dataset (English) and 2,000 samples from the DiDiSpeech-2 dataset (Mandarin). The set is used to measure a model's performance on a range of objective metrics.
Target Users
This test set is designed for researchers and developers in the field of speech synthesis, who can use seed-tts-eval to evaluate and refine their speech synthesis systems.
Use Cases
Researchers utilize seed-tts-eval to assess the performance of novel speech synthesis models.
Developers leverage this test set to compare the effectiveness of various speech synthesis techniques.
Educational institutions use the test set as a teaching resource for speech synthesis technology.
Features
Evaluation using samples from the Common Voice and DiDiSpeech-2 datasets
Utilization of Word Error Rate (WER) and Speaker Similarity (SIM) as evaluation metrics
Employment of Whisper-large-v3 and Paraformer-zh as automatic speech recognition engines for English and Mandarin, respectively
Use of the WavLM-large model for speaker similarity evaluation
Provision of a download link for the test set
Support for evaluating zero-shot text-to-speech (TTS) and voice conversion (VC) tasks
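To illustrate the two metrics listed above, the sketch below implements word error rate (WER) as Levenshtein distance over the reference length, and speaker similarity (SIM) as cosine similarity between two embedding vectors. This is only a minimal, self-contained illustration of the metric definitions: in the actual test set, transcripts come from Whisper-large-v3 (English) or Paraformer-zh (Mandarin), and the embeddings from the WavLM-large speaker model, none of which are reproduced here.

```python
import math

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

def speaker_sim(emb_a, emb_b) -> float:
    """Cosine similarity between two speaker-embedding vectors."""
    dot = sum(a * b for a, b in zip(emb_a, emb_b))
    norm_a = math.sqrt(sum(a * a for a in emb_a))
    norm_b = math.sqrt(sum(b * b for b in emb_b))
    return dot / (norm_a * norm_b)
```

Lower WER and higher SIM indicate better intelligibility and closer voice match, respectively.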
How to Use
Visit the seed-tts-eval GitHub page.
Read the README file to understand how to install dependencies and use the test set.
Download the required test set samples.
Use the provided evaluation code to assess the model's performance.
Optimize the speech synthesis model based on the evaluation results.
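Once the repository's evaluation code has produced per-utterance scores, the final assessment step amounts to aggregating them into corpus-level numbers. A minimal sketch, assuming per-utterance results are available as (WER, SIM) pairs (a hypothetical intermediate format, not the repository's actual output):

```python
from statistics import mean

def summarize(scores):
    """Aggregate per-utterance (wer, sim) pairs into corpus-level averages.

    `scores`: iterable of (wer, sim) tuples, one per synthesized utterance.
    """
    wers, sims = zip(*scores)
    return {"WER": mean(wers), "SIM": mean(sims)}

# Hypothetical per-utterance results for three synthesized samples
report = summarize([(0.05, 0.72), (0.10, 0.68), (0.00, 0.75)])
```

Comparing these averages across model versions is what drives the optimization loop in the final step above.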