Image Captioning on COCO
Metrics
BLEU-4
CIDEr
Results
Performance results of various models on this benchmark
| Model | BLEU-4 | CIDEr | Paper Title | Repository |
|---|---|---|---|---|
| UNIMO-large | 39.6 | 127.7 | UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning | |
| DALL-E | - | 20.2 | Retrieval-Augmented Multimodal Language Modeling | - |
| Bit Diffusion (20 steps) | 34.7 | 115 | Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning | |
| Flamingo (80B; 4-shot) | - | 103 | Retrieval-Augmented Multimodal Language Modeling | - |
| ruDALL-E-XL | - | 38.7 | Retrieval-Augmented Multimodal Language Modeling | - |
| minDALL-E | - | 48 | Retrieval-Augmented Multimodal Language Modeling | - |
| IGINet | 39.9 | 131.0 | - | - |
| Parti | - | 83.9 | Retrieval-Augmented Multimodal Language Modeling | - |
| X-LXMERT | - | 55.8 | Retrieval-Augmented Multimodal Language Modeling | - |
| RDN | - | 125.2 | Reflective Decoding Network for Image Captioning | - |
| ExpansionNet v2 | - | 143.7 | Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning | |
| Vanilla CM3 | - | 71.9 | Retrieval-Augmented Multimodal Language Modeling | - |
| RA-CM3 (2.7B) | - | 89.1 | Retrieval-Augmented Multimodal Language Modeling | - |
| M2 Transformer | - | 131.2 | Meshed-Memory Transformer for Image Captioning | |
| Flamingo (3B; 4-shot) | - | 85 | Retrieval-Augmented Multimodal Language Modeling | - |
| NIC (ResNet-50, CutMix) | 24.9 | 77.6 | CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features | |
| Lyrics | - | 121.1 | Lyrics: Boosting Fine-grained Language-Vision Alignment and Comprehension via Semantic-aware Visual Objects | - |
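The results above are listed in no particular order. As a minimal sketch, the following Python snippet copies the model names and scores from the table and sorts them by CIDEr; the names `RESULTS`, `rank_by_cider`, and `best_bleu4` are hypothetical helpers introduced here for illustration and are not part of the benchmark page. Entries shown as "-" are treated as missing values. Note that some rows are labeled as 4-shot results, so a raw sort is not necessarily a like-for-like comparison.

```python
from typing import Optional

# (model, BLEU-4, CIDEr) copied from the results table; None marks a "-" entry.
RESULTS: list[tuple[str, Optional[float], float]] = [
    ("UNIMO-large", 39.6, 127.7),
    ("DALL-E", None, 20.2),
    ("Bit Diffusion (20 steps)", 34.7, 115.0),
    ("Flamingo (80B; 4-shot)", None, 103.0),
    ("ruDALL-E-XL", None, 38.7),
    ("minDALL-E", None, 48.0),
    ("IGINet", 39.9, 131.0),
    ("Parti", None, 83.9),
    ("X-LXMERT", None, 55.8),
    ("RDN", None, 125.2),
    ("ExpansionNet v2", None, 143.7),
    ("Vanilla CM3", None, 71.9),
    ("RA-CM3 (2.7B)", None, 89.1),
    ("M2 Transformer", None, 131.2),
    ("Flamingo (3B; 4-shot)", None, 85.0),
    ("NIC (ResNet-50, CutMix)", 24.9, 77.6),
    ("Lyrics", None, 121.1),
]


def rank_by_cider(rows):
    """Return rows sorted by CIDEr, highest score first."""
    return sorted(rows, key=lambda row: row[2], reverse=True)


def best_bleu4(rows):
    """Return the row with the highest BLEU-4 among rows that report it."""
    scored = [row for row in rows if row[1] is not None]
    return max(scored, key=lambda row: row[1])


if __name__ == "__main__":
    for rank, (model, bleu4, cider) in enumerate(rank_by_cider(RESULTS), start=1):
        bleu = "-" if bleu4 is None else f"{bleu4:.1f}"
        print(f"{rank:2d}. {model:<26} CIDEr {cider:6.1f}  BLEU-4 {bleu}")
```

Keeping missing BLEU-4 values as `None` rather than 0 avoids ranking models that simply did not report the metric below those that did.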