Text-to-Image Generation on COCO

Evaluation Metrics

FID
Inception score
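
For reading the results, lower FID is better and a higher Inception score is better. FID compares Gaussian statistics of Inception features from real and generated images, while the Inception score is exp(E_x[KL(p(y|x) || p(y))]), computed from a classifier's predictions on generated images. As a minimal sketch of the latter, assuming the per-image class probabilities are already available as plain Python lists (the function name `inception_score` and its input format are illustrative assumptions, not taken from any of the listed papers):

```python
import math


def inception_score(probs):
    """Inception score from per-image class probabilities.

    probs: list of rows, each row a distribution p(y|x) over classes for
    one generated image. This input format is an assumption for the
    sketch; in practice the rows come from a pretrained Inception
    classifier.
    """
    n = len(probs)
    num_classes = len(probs[0])

    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(row[k] for row in probs) / n for k in range(num_classes)]

    # Mean KL(p(y|x) || p(y)); terms with p(y|x) = 0 contribute nothing.
    mean_kl = 0.0
    for row in probs:
        mean_kl += sum(
            p * math.log(p / marginal[k]) for k, p in enumerate(row) if p > 0
        )
    mean_kl /= n

    return math.exp(mean_kl)
```

With confident and evenly spread predictions such as `[[1.0, 0.0], [0.0, 1.0]]` the score reaches the number of classes, and with identical predictions for every image it drops to 1. Published scores are usually averaged over several splits of a much larger sample, which this sketch omits.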

Results

Performance of each model on this benchmark.

| Model | FID | Inception score | Paper Title | Repository |
| --- | --- | --- | --- | --- |
| AttnGAN+CL | 23.93 | 25.70 | Improving Text-to-Image Synthesis Using Contrastive Learning | |
| FuseDream (few-shot, k=5) | 21.16 | 34.26 | FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization | |
| Corgi-Semi | 10.6 | - | Shifted Diffusion for Text-to-image Generation | |
| Vanilla CM3 | 29.5 | - | Retrieval-Augmented Multimodal Language Modeling | - |
| StackGAN-v1 | 74.05 | 8.45 | StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks | |
| StyleGAN-T (Zero-shot, 256x256) | 13.9 | - | StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis | |
| Re-Imagen (Finetuned) | 5.25 | - | Re-Imagen: Retrieval-Augmented Text-to-Image Generator | - |
| GLIGEN (fine-tuned, Detection data only) | 5.82 | - | GLIGEN: Open-Set Grounded Text-to-Image Generation | |
| DF-GAN (256 x 256) | - | 18.7 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | |
| DALL-E (256 x 256) | 27.5 | 17.9 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | |
| Lafite | 8.12 | 32.34 | LAFITE: Towards Language-Free Training for Text-to-Image Generation | |
| Make-a-Scene (unfiltered) | 11.84 | - | Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors | |
| DALL-E 2 | 10.39 | - | Hierarchical Text-Conditional Image Generation with CLIP Latents | |
| Imagen (zero-shot) | 7.27 | - | Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding | - |
| VQ-Diffusion-F | 13.86 | - | Vector Quantized Diffusion Model for Text-to-Image Synthesis | |
| RAT-Diffusion | 5.00 | - | Data Extrapolation for Text-to-image Generation on Small Datasets | |
| Lafite (zero-shot) | 26.94 | 26.02 | LAFITE: Towards Language-Free Training for Text-to-Image Generation | |
| eDiff-I (zero-shot) | 6.95 | - | eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers | |
| L-Verse | 45.8 | - | L-Verse: Bidirectional Generation Between Image and Text | |
| GLIGEN (fine-tuned, Grounding data) | 6.38 | - | GLIGEN: Open-Set Grounded Text-to-Image Generation | |