

seed-tts-eval
Overview
seed-tts-eval is a test set for evaluating a model's zero-shot speech generation capability. It provides objective evaluation data across diverse domains, with samples drawn from public English and Mandarin corpora: 1,000 samples from the Common Voice dataset and 2,000 samples from the DiDiSpeech-2 dataset. The test set is used to measure a model's performance on objective metrics.
Target Users
This test set is designed for researchers and developers working on speech synthesis, who can use seed-tts-eval to evaluate and refine their speech synthesis systems.
Use Cases
Researchers use seed-tts-eval to assess the performance of new speech synthesis models.
Developers use the test set to compare different speech synthesis techniques.
Educational institutions use the test set as a teaching resource for speech synthesis technologies.
Features
Evaluates models on samples from the Common Voice and DiDiSpeech-2 datasets
Uses Word Error Rate (WER) and Speaker Similarity (SIM) as evaluation metrics
Uses Whisper-large-v3 and Paraformer-zh as the automatic speech recognition engines for English and Mandarin, respectively
Uses the WavLM-large model for speaker similarity evaluation
Provides a download link for the test set
Supports evaluation of zero-shot text-to-speech (TTS) and voice conversion (VC) tasks
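The two metrics named above can be illustrated with a minimal sketch. This is not the evaluation code shipped with seed-tts-eval. It assumes the transcripts have already been produced by a speech recognition model and the speaker embeddings by a speaker model, and both function names are hypothetical. WER is computed here as word-level edit distance divided by the reference length, and SIM as the cosine similarity of two embedding vectors.

```python
import math
from typing import Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; tokens are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
    # Toy 2-dimensional vectors standing in for real speaker embeddings.
    print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Lower WER means the synthesized speech is transcribed closer to the target text, and higher SIM means the generated voice is closer to the reference speaker. Splitting on whitespace suits English only; for Mandarin, error rates are commonly computed over characters, which this sketch does not handle.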
How to Use
Visit the seed-tts-eval GitHub page.
Read the README file to understand how to install dependencies and use the test set.
Download the required test set samples.
Use the provided evaluation code to assess the model's performance.
Optimize the speech synthesis model based on the evaluation results.