HyperAI
Text-to-Image Generation on COCO
Metrics
FID
Inception score
Results
Performance results of various models on this benchmark
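For reference, a brief sketch of how these two metrics are commonly defined. The notation is an assumption and does not come from this page: $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the mean and covariance of Inception features for real and generated images, $p(y \mid x)$ is the Inception classifier's label distribution for a generated image $x$, and $p(y)$ is its marginal over generated images. Lower FID is better; a higher Inception score is better.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)

\mathrm{IS} = \exp\!\left( \mathbb{E}_{x \sim p_g}
  \left[ D_{\mathrm{KL}}\!\left( p(y \mid x) \,\Vert\, p(y) \right) \right] \right)
```

Reported values also depend on evaluation details such as image resolution and the number of samples, so the scores listed here are not necessarily directly comparable across papers.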
| Model name | FID | Inception score | Paper title | Repository |
| --- | --- | --- | --- | --- |
| AttnGAN+CL | 23.93 | 25.70 | Improving Text-to-Image Synthesis Using Contrastive Learning | |
| FuseDream (few-shot, k=5) | 21.16 | 34.26 | FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization | |
| Corgi-Semi | 10.6 | - | Shifted Diffusion for Text-to-image Generation | - |
| Vanilla CM3 | 29.5 | - | Retrieval-Augmented Multimodal Language Modeling | - |
| StackGAN-v1 | 74.05 | 8.45 | StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks | |
| StyleGAN-T (Zero-shot, 256x256) | 13.9 | - | StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis | |
| Re-Imagen (Finetuned) | 5.25 | - | Re-Imagen: Retrieval-Augmented Text-to-Image Generator | - |
| GLIGEN (fine-tuned, Detection data only) | 5.82 | - | GLIGEN: Open-Set Grounded Text-to-Image Generation | |
| DF-GAN (256 x 256) | - | 18.7 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | |
| DALL-E (256 x 256) | 27.5 | 17.9 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | |
| Lafite | 8.12 | 32.34 | LAFITE: Towards Language-Free Training for Text-to-Image Generation | |
| Make-a-Scene (unfiltered) | 11.84 | - | Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors | - |
| DALL-E 2 | 10.39 | - | Hierarchical Text-Conditional Image Generation with CLIP Latents | - |
| Imagen (zero-shot) | 7.27 | - | Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding | |
| VQ-Diffusion-F | 13.86 | - | Vector Quantized Diffusion Model for Text-to-Image Synthesis | |
| RAT-Diffusion | 5.00 | - | Data Extrapolation for Text-to-image Generation on Small Datasets | - |
| Lafite (zero-shot) | 26.94 | 26.02 | LAFITE: Towards Language-Free Training for Text-to-Image Generation | |
| eDiff-I (zero-shot) | 6.95 | - | eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers | |
| L-Verse | 45.8 | - | L-Verse: Bidirectional Generation Between Image and Text | |
| GLIGEN (fine-tuned, Grounding data) | 6.38 | - | GLIGEN: Open-Set Grounded Text-to-Image Generation | |