HyperAI超神経
Image Captioning on COCO
Evaluation metrics
BLEU-4
CIDEr
Results
Performance of each model on this benchmark
| Model | BLEU-4 | CIDEr | Paper Title | Repository |
| --- | --- | --- | --- | --- |
| UNIMO-large | 39.6 | 127.7 | UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning | |
| DALL-E | - | 20.2 | Retrieval-Augmented Multimodal Language Modeling | - |
| Bit Diffusion (20 steps) | 34.7 | 115 | Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning | |
| Flamingo (80B; 4-shot) | - | 103 | Retrieval-Augmented Multimodal Language Modeling | - |
| ruDALL-E-XL | - | 38.7 | Retrieval-Augmented Multimodal Language Modeling | - |
| minDALL-E | - | 48 | Retrieval-Augmented Multimodal Language Modeling | - |
| IGINet | 39.9 | 131.0 | - | - |
| Parti | - | 83.9 | Retrieval-Augmented Multimodal Language Modeling | - |
| X-LXMERT | - | 55.8 | Retrieval-Augmented Multimodal Language Modeling | - |
| RDN | - | 125.2 | Reflective Decoding Network for Image Captioning | - |
| ExpansionNet v2 | - | 143.7 | Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning | |
| Vanilla CM3 | - | 71.9 | Retrieval-Augmented Multimodal Language Modeling | - |
| RA-CM3 (2.7B) | - | 89.1 | Retrieval-Augmented Multimodal Language Modeling | - |
| M2 Transformer | - | 131.2 | Meshed-Memory Transformer for Image Captioning | |
| Flamingo (3B; 4-shot) | - | 85 | Retrieval-Augmented Multimodal Language Modeling | - |
| NIC (ResNet-50, CutMix) | 24.9 | 77.6 | CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features | |
| Lyrics | - | 121.1 | Lyrics: Boosting Fine-grained Language-Vision Alignment and Comprehension via Semantic-aware Visual Objects | - |