HyperAI
Image Captioning on COCO
Metrics
BLEU-4
CIDEr
Results
Performance results of different models on this benchmark
| Model Name | BLEU-4 | CIDEr | Paper Title |
| --- | --- | --- | --- |
| ExpansionNet v2 | - | 143.7 | Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning |
| M2 Transformer | - | 131.2 | Meshed-Memory Transformer for Image Captioning |
| IGINet | 39.9 | 131.0 | - |
| UNIMO-large | 39.6 | 127.7 | UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning |
| RDN | - | 125.2 | Reflective Decoding Network for Image Captioning |
| Lyrics | - | 121.1 | Lyrics: Boosting Fine-grained Language-Vision Alignment and Comprehension via Semantic-aware Visual Objects |
| Bit Diffusion (20 steps) | 34.7 | 115 | Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning |
| Flamingo (80B; 4-shot) | - | 103 | Retrieval-Augmented Multimodal Language Modeling |
| RA-CM3 (2.7B) | - | 89.1 | Retrieval-Augmented Multimodal Language Modeling |
| Flamingo (3B; 4-shot) | - | 85 | Retrieval-Augmented Multimodal Language Modeling |
| Parti | - | 83.9 | Retrieval-Augmented Multimodal Language Modeling |
| NIC (ResNet-50, CutMix) | 24.9 | 77.6 | CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features |
| Vanilla CM3 | - | 71.9 | Retrieval-Augmented Multimodal Language Modeling |
| X-LXMERT | - | 55.8 | Retrieval-Augmented Multimodal Language Modeling |
| minDALL-E | - | 48 | Retrieval-Augmented Multimodal Language Modeling |
| ruDALL-E-XL | - | 38.7 | Retrieval-Augmented Multimodal Language Modeling |
| DALL-E | - | 20.2 | Retrieval-Augmented Multimodal Language Modeling |