

ImagenHub
Overview:
ImagenHub is a one-stop repository for standardizing the inference and evaluation of conditional image generation models. First, we define seven prominent tasks and curate high-quality evaluation datasets for them. Second, we build a unified inference pipeline to ensure fair comparisons. Third, we design two human evaluation metrics, semantic consistency and perceptual quality, together with comprehensive guidelines for evaluating generated images, and we train expert reviewers to rate model outputs against these metrics. The human evaluation achieved high inter-rater consistency on 76% of the models. We evaluated around 30 models and observed three key findings: (1) Except for text-guided image generation and theme-driven image generation, the performance of existing models is generally unsatisfactory, with 74% of models scoring below 0.5 overall. (2) Of the claims we examined from published papers, 83% proved accurate. (3) Apart from theme-driven image generation, no existing automatic evaluation metric reaches a Spearman correlation coefficient above 0.2 with human ratings. We will continue to evaluate newly released models and update the rankings to track progress in conditional image generation.
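Finding (3) rests on Spearman correlation, which compares how two score lists rank the same items. The following is a minimal pure-Python sketch of that computation, not ImagenHub's own code. The function names and the sample numbers are illustrative assumptions and are not results from the project.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Made-up numbers for illustration only (hypothetical, not ImagenHub data).
human_scores = [0.0, 0.5, 1.0, 0.5, 1.0]
metric_scores = [0.21, 0.35, 0.30, 0.28, 0.40]
rho = spearman(human_scores, metric_scores)
print(f"Spearman correlation: {rho:.3f}")
```

A value near 1 means the automatic metric orders images the way the human raters do. A value below 0.2, as reported above, means the two orderings barely agree.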
Target Users:
ImagenHub is a platform for the standardized evaluation of conditional image generation models. Researchers and developers can use it to compare different models fairly and to track the progress of the field.
Use Cases
ImagenHub collects seven major conditional image generation tasks, including text-guided image generation, mask-guided image editing, and theme-driven image generation, providing researchers with a comprehensive evaluation dataset.
ImagenHub establishes a unified inference pipeline, ensuring different models are compared fairly under the same evaluation process.
ImagenHub designs two human evaluation metrics, semantic consistency and perceptual quality, and trains expert reviewers to evaluate model outputs based on these metrics, achieving high inter-rater consistency.
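The two human metrics can be folded into a single overall score per image. The sketch below shows one plausible way to do that and is not taken from the text above. It makes two assumptions: each rating is one of 0, 0.5, or 1, and the overall score is the geometric mean of semantic consistency and perceptual quality. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean

# Assumed rating scale for both metrics (an assumption, not stated above).
ALLOWED = (0.0, 0.5, 1.0)


def overall_score(sc, pq):
    """Combine semantic consistency (sc) and perceptual quality (pq).

    Assumes a geometric mean, so an image scoring 0 on either metric
    gets an overall score of 0 regardless of the other metric.
    """
    if sc not in ALLOWED or pq not in ALLOWED:
        raise ValueError(f"ratings must be one of {ALLOWED}")
    return sqrt(sc * pq)


def model_score(ratings):
    """Average the overall score over a list of (sc, pq) pairs, one per image."""
    return mean(overall_score(sc, pq) for sc, pq in ratings)


# Hypothetical ratings for three generated images.
ratings = [(1.0, 0.5), (0.5, 0.5), (0.0, 1.0)]
print(f"model score: {model_score(ratings):.3f}")
```

The geometric mean is chosen here because it penalizes an image that looks good but ignores the condition, or follows the condition but looks poor. An arithmetic mean would let one metric mask the other.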
Features
Defines seven major conditional image generation tasks
Builds high-quality evaluation datasets
Establishes a unified inference pipeline
Designs semantic consistency and perceptual quality human evaluation metrics
Trains expert reviewers for evaluation
Comprehensively evaluates around 30 conditional image generation models
Updates rankings to track field progress
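The idea behind a unified inference pipeline is that every model receives the same inputs, the same seed, and the same calling convention. The sketch below illustrates that idea only and does not reflect ImagenHub's actual API. All names in it (Sample, ModelFn, run_benchmark, the toy models) are hypothetical, and strings stand in for generated images.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Sample:
    """One benchmark input (hypothetical structure)."""
    uid: str
    prompt: str


# Hypothetical model signature: (prompt, seed) -> output placeholder.
ModelFn = Callable[[str, int], str]


def run_benchmark(
    models: Dict[str, ModelFn], dataset: List[Sample], seed: int = 42
) -> Dict[str, Dict[str, str]]:
    """Run every model on every sample with an identical seed."""
    results: Dict[str, Dict[str, str]] = {}
    for name, generate in models.items():
        results[name] = {s.uid: generate(s.prompt, seed) for s in dataset}
    return results


def toy_model_a(prompt: str, seed: int) -> str:
    # Stand-in for a real generator; the seed makes the output reproducible.
    rng = random.Random(seed)
    return f"a:{prompt}:{rng.randint(0, 9)}"


def toy_model_b(prompt: str, seed: int) -> str:
    rng = random.Random(seed)
    return f"b:{prompt.upper()}:{rng.randint(0, 9)}"


dataset = [Sample("s1", "a red cube"), Sample("s2", "a blue sphere")]
models = {"model_a": toy_model_a, "model_b": toy_model_b}
results = run_benchmark(models, dataset)
print(sorted(results["model_a"]))
```

Because every model is called through the same function with the same dataset and seed, differences in the outputs come from the models themselves and not from the evaluation setup.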