HyperAI超神经
Text-to-Image Generation on COCO
Evaluation metrics
FID
Inception score
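For reference, the two metrics are commonly defined as sketched below. The symbols are assumptions introduced here for illustration and do not come from this page: $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the mean and covariance of Inception features for real and generated images, $p_g$ is the generator's image distribution, and $p(y \mid x)$ is the Inception classifier's label distribution for image $x$. Lower FID is better; higher Inception score is better. Individual papers may differ in sample counts and preprocessing, so scores are not always strictly comparable.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)

\mathrm{IS} = \exp\!\left( \mathbb{E}_{x \sim p_g}
  \left[ D_{\mathrm{KL}}\!\left( p(y \mid x) \,\Vert\, p(y) \right) \right] \right),
\qquad p(y) = \mathbb{E}_{x \sim p_g}\left[ p(y \mid x) \right]
```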
Results
Performance of each model on this benchmark
| Model | FID | Inception score | Paper Title | Repository |
| --- | --- | --- | --- | --- |
| AttnGAN+CL | 23.93 | 25.70 | Improving Text-to-Image Synthesis Using Contrastive Learning | |
| FuseDream (few-shot, k=5) | 21.16 | 34.26 | FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization | |
| Corgi-Semi | 10.6 | - | Shifted Diffusion for Text-to-image Generation | - |
| Vanilla CM3 | 29.5 | - | Retrieval-Augmented Multimodal Language Modeling | - |
| StackGAN-v1 | 74.05 | 8.45 | StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks | |
| StyleGAN-T (Zero-shot, 256x256) | 13.9 | - | StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis | |
| Re-Imagen (Finetuned) | 5.25 | - | Re-Imagen: Retrieval-Augmented Text-to-Image Generator | - |
| GLIGEN (fine-tuned, Detection data only) | 5.82 | - | GLIGEN: Open-Set Grounded Text-to-Image Generation | |
| DF-GAN (256 x 256) | - | 18.7 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | |
| DALL-E (256 x 256) | 27.5 | 17.9 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | |
| Lafite | 8.12 | 32.34 | LAFITE: Towards Language-Free Training for Text-to-Image Generation | |
| Make-a-Scene (unfiltered) | 11.84 | - | Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors | - |
| DALL-E 2 | 10.39 | - | Hierarchical Text-Conditional Image Generation with CLIP Latents | - |
| Imagen (zero-shot) | 7.27 | - | Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding | |
| VQ-Diffusion-F | 13.86 | - | Vector Quantized Diffusion Model for Text-to-Image Synthesis | |
| RAT-Diffusion | 5.00 | - | Data Extrapolation for Text-to-image Generation on Small Datasets | - |
| Lafite (zero-shot) | 26.94 | 26.02 | LAFITE: Towards Language-Free Training for Text-to-Image Generation | |
| eDiff-I (zero-shot) | 6.95 | - | eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers | |
| L-Verse | 45.8 | - | L-Verse: Bidirectional Generation Between Image and Text | |
| GLIGEN (fine-tuned, Grounding data) | 6.38 | - | GLIGEN: Open-Set Grounded Text-to-Image Generation | |
20 of 69 entries shown.