HyperAI

Image Captioning on nocaps Out-of-Domain

Metrics

CIDEr
SPICE

Results

Performance results of various models on this benchmark

| Model name | CIDEr | SPICE | Paper Title | Repository |
| --- | --- | --- | --- | --- |
| ClipCap (Transformer) | 49.14 | 9.57 | ClipCap: CLIP Prefix for Image Captioning | |
| CS395T | 21.3 | 7.2 | - | - |
| ClipCap (MLP + GPT2 tuning) | 49.35 | 9.7 | ClipCap: CLIP Prefix for Image Captioning | |
| ViTCAP-CIDEr-136.7-ENC-DEC-ViTbfocal10-test-CBS | 72.13 | 11.53 | - | - |
| UpDown | 30.09 | 8.08 | - | - |
| area_attention | 26.55 | 7.72 | - | - |
| Neural Baby Talk + CBS | 58.48 | 8.77 | - | - |
| nocaps_training | 30.09 | 8.08 | - | - |
| Microsoft Cognitive Services team | 110.14 | 13.74 | VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning | - |
| vinvl_yuan_cbs | 71.43 | 10.57 | - | - |
| Neural Baby Talk | 48.73 | 8.2 | - | - |
| firethehole | 88.54 | 13.87 | - | - |
| UpDown-C | 70.21 | 10.15 | - | - |
| FudanWYZ | 103.75 | 13.75 | - | - |
| evertyhing | 85.18 | 11.18 | - | - |
| Xinyi | 68.92 | 10.05 | - | - |
| IEDA-LAB | 87.51 | 12.52 | - | - |
| MD | 77.39 | 11.59 | - | - |
| coco_all_19 | 23.07 | 7.4 | - | - |
| cxy_nocaps_training | 68.5 | 10.01 | - | - |