

CelebV-Text
Overview:
CelebV-Text is a large-scale, high-quality, and diverse face text-video dataset designed to support research on face text-video generation. It contains 70,000 in-the-wild face video clips, each paired with 20 text descriptions. The descriptions cover 40 general appearances, 5 detailed appearances, 6 light conditions, 37 actions, 8 emotions, and 6 light directions. Statistical analysis of the videos, the texts, and the text-video correlation is reported in support of the dataset's quality, and the dataset comes with a benchmark intended to standardize the evaluation of face text-video generation.
Target Users:
Researchers working on face text-video generation tasks
Use Cases
Researching face text-video generation tasks
Analyzing the correlation between face videos and their text descriptions
Building a benchmark for face text-video generation tasks
Features
A large-scale face text-video dataset
70,000 in-the-wild face video clips
Each video clip is accompanied by 20 text descriptions
Covers 40 general appearances, 5 detailed appearances, 6 light conditions, 37 actions, 8 emotions, and 6 light directions
Statistical analysis of the videos, texts, and text-video correlation supports the dataset's quality
Includes a benchmark that standardizes the evaluation of face text-video generation tasks
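
The figures listed above can be checked with a small sketch. The record layout below (`FaceClipSample`, `clip_id`, `video_path`, `descriptions`) is hypothetical and used only for illustration; it is not the dataset's actual file format or loader. Only the counts come from the description.

```python
from dataclasses import dataclass, field

# Attribute categories and their sizes, as listed in the feature summary.
ATTRIBUTE_CATEGORIES = {
    "general_appearance": 40,
    "detailed_appearance": 5,
    "light_condition": 6,
    "action": 37,
    "emotion": 8,
    "light_direction": 6,
}

NUM_CLIPS = 70_000       # video clips in the dataset
TEXTS_PER_CLIP = 20      # text descriptions per clip


@dataclass
class FaceClipSample:
    """Hypothetical record for one clip; not the dataset's real schema."""
    clip_id: str
    video_path: str
    descriptions: list[str] = field(default_factory=list)

    def is_complete(self) -> bool:
        # A clip is complete when it carries the expected number of texts.
        return len(self.descriptions) == TEXTS_PER_CLIP


def total_text_video_pairs(num_clips: int = NUM_CLIPS,
                           texts_per_clip: int = TEXTS_PER_CLIP) -> int:
    """Number of (video, text) pairs implied by the stated counts."""
    return num_clips * texts_per_clip


def total_attribute_labels() -> int:
    """Total number of attribute labels across all six categories."""
    return sum(ATTRIBUTE_CATEGORIES.values())


if __name__ == "__main__":
    print(total_text_video_pairs())   # 1400000
    print(total_attribute_labels())   # 102
```

Under these assumptions, the stated counts imply 1,400,000 text-video pairs and 102 attribute labels in total.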
Featured AI Tools

LiveFood
LiveFood is a dataset of over 5,100 food videos spanning four domains: ingredients, cooking, presentation, and consumption. All videos are annotated by professionals, and a strict double-checking process is used to further ensure annotation quality. The authors also propose the Global Prototype Encoding (GPE) model to address the incremental learning problem, reporting competitive performance compared to traditional techniques.
AI Datasets
66.5K