HyperAI

Image Captioning on nocaps (Out-of-Domain)

Metrics

CIDEr
SPICE

Results

Performance of various models on this benchmark, as measured by CIDEr and SPICE

Comparison Table
| Model Name | CIDEr | SPICE |
| --- | --- | --- |
| clipcap-clip-prefix-for-image-captioning | 49.14 | 9.57 |
| Model 2 | 21.3 | 7.2 |
| clipcap-clip-prefix-for-image-captioning | 49.35 | 9.7 |
| Model 4 | 72.13 | 11.53 |
| Model 5 | 30.09 | 8.08 |
| Model 6 | 26.55 | 7.72 |
| Model 7 | 58.48 | 8.77 |
| Model 8 | 30.09 | 8.08 |
| vivo-surpassing-human-performance-in-novel | 110.14 | 13.74 |
| Model 10 | 71.43 | 10.57 |
| Model 11 | 48.73 | 8.2 |
| Model 12 | 88.54 | 13.87 |
| Model 13 | 70.21 | 10.15 |
| Model 14 | 103.75 | 13.75 |
| Model 15 | 85.18 | 11.18 |
| Model 16 | 68.92 | 10.05 |
| Model 17 | 87.51 | 12.52 |
| Model 18 | 77.39 | 11.59 |
| Model 19 | 23.07 | 7.4 |
| Model 20 | 68.5 | 10.01 |
| Model 21 | 54.56 | 9.9 |
| git-a-generative-image-to-text-transformer | 122.27 | 15.62 |
| vinvl-making-visual-representations-matter-in | 78.01 | 11.48 |
| simvlm-simple-visual-language-model | 109.49 | 13.89 |
| Model 25 | 26.25 | 7.52 |
| Model 26 | 91.62 | 14.21 |
| Model 27 | 87.15 | 11.43 |
| Model 28 | 121.69 | 15.13 |
| Model 29 | 36.12 | 9.39 |
| git-a-generative-image-to-text-transformer | 122.04 | 15.7 |
| Model 31 | 39.39 | 7.62 |
| Model 32 | 75.39 | 10.68 |
| Model 33 | 66.67 | 9.74 |
| Model 34 | 43.2 | 9.35 |
| Model 35 | 78.91 | 12.14 |
| Model 36 | 25.91 | 7.61 |
| Model 37 | 73.75 | 9.72 |
| pali-a-jointly-scaled-multilingual-language | 126.67 | 15.49 |
| Model 39 | 106.55 | 14.21 |
| grit-faster-and-better-image-captioning | 72.6 | 11.1 |