HyperAI
Text To Image Generation On Coco
Metrics
FID
Inception score
Results
Performance results of various models on this benchmark
| Model Name | FID | Inception score | Paper Title |
|---|---|---|---|
| StackGAN-v1 | 74.05 | 8.45 | StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks |
| StackGAN + OP | 55.30 | 12.12 | Generating Multiple Objects at Spatially Distinct Locations |
| L-Verse | 45.8 | - | L-Verse: Bidirectional Generation Between Image and Text |
| L-Verse-CC | 37.2 | - | L-Verse: Bidirectional Generation Between Image and Text |
| AttnGAN (256 x 256) | 35.2 | 23.3 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion |
| AttnGAN + OP | 33.35 | 24.76 | Generating Multiple Objects at Spatially Distinct Locations |
| DM-GAN | 32.64 | 30.49 | DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-Image Synthesis |
| DM-GAN + VICTR | 32.37 | 32.37 | VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks |
| Vanilla CM3 | 29.5 | - | Retrieval-Augmented Multimodal Language Modeling |
| AttnGAN + VICTR | 29.26 | 28.18 | VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks |
| DALL-E (12B) | 28 | - | Retrieval-Augmented Multimodal Language Modeling |
| DALL-E (256 x 256) | 27.5 | 17.9 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion |
| CogView | 27.1 | 18.2 | CogView: Mastering Text-to-Image Generation via Transformers |
| CogView (256 x 256) | 27.1 | 18.2 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion |
| Lafite (zero-shot) | 26.94 | 26.02 | LAFITE: Towards Language-Free Training for Text-to-Image Generation |
| DM-GAN (256 x 256) | 26.0 | 32.2 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion |
| OP-GAN | 24.70 | 27.88 | Semantic Object Accuracy for Generative Text-to-Image Synthesis |
| CogView2(6B, Finetuned) | 24 | - | CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers |
| AttnGAN+CL | 23.93 | 25.70 | Improving Text-to-Image Synthesis Using Contrastive Learning |
| FuseDream (k=10, 256) | 21.89 | 34.67 | FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization |
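The rows above are listed from highest to lowest FID. Since a lower FID is better and a higher Inception score is better, it can help to re-rank the entries by either metric. The following is a minimal sketch, not part of the benchmark page: the `rows` list holds a handful of entries copied from the table, and the helper names `rank_by_fid` and `rank_by_inception_score` are hypothetical. Entries with no reported Inception score (shown as `-`) are stored as `None` and skipped when ranking by that metric.

```python
# A few rows copied from the table above: (model, FID, Inception score or None).
rows = [
    ("StackGAN-v1", 74.05, 8.45),
    ("L-Verse-CC", 37.2, None),  # Inception score not reported ("-")
    ("DM-GAN (256 x 256)", 26.0, 32.2),
    ("OP-GAN", 24.70, 27.88),
    ("FuseDream (k=10, 256)", 21.89, 34.67),
]


def rank_by_fid(rows):
    """Return rows sorted best-first by FID (lower is better)."""
    return sorted(rows, key=lambda r: r[1])


def rank_by_inception_score(rows):
    """Return rows that report an Inception score, best-first (higher is better)."""
    scored = [r for r in rows if r[2] is not None]
    return sorted(scored, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    for model, fid, inception in rank_by_fid(rows):
        shown = "-" if inception is None else f"{inception:.2f}"
        print(f"{model:<24} FID={fid:<6} IS={shown}")
```

Note that the two rankings need not agree: among the sampled rows, DM-GAN (256 x 256) has a worse FID than OP-GAN but a better Inception score.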
Showing 20 of 69 rows.