Image Captioning on nocaps (In-Domain)
Metrics
B1 (BLEU-1)
B2 (BLEU-2)
B3 (BLEU-3)
B4 (BLEU-4)
CIDEr
METEOR
ROUGE-L
SPICE
Results
Scores reported for each model on the nocaps in-domain split. A dash (-) marks a metric with no reported value.
Comparison Table
Model Name | B1 | B2 | B3 | B4 | CIDEr | METEOR | ROUGE-L | SPICE |
---|---|---|---|---|---|---|---|---|
git-a-generative-image-to-text-transformer | 88.86 | 75.86 | 59.94 | 41.1 | 124.18 | 33.83 | 63.82 | 16.36 |
Model 2 | 84.03 | 69.12 | 51.16 | 33.15 | 100.03 | 30.06 | 59.67 | 14.08 |
Model 3 | 87.27 | 74.29 | 58.01 | 39.24 | 117.9 | 33.01 | 63.12 | 15.49 |
Model 4 | 75.31 | 56.79 | 37.85 | 21.91 | 73.73 | 26.02 | 52.44 | 12.04 |
Model 5 | 84.4 | 69.8 | 51.89 | 32.86 | 102.64 | 30.43 | 60.07 | 14.47 |
pali-a-jointly-scaled-multilingual-language | - | - | - | - | 149.1 | - | - | - |
Model 7 | 81.61 | 63.74 | 43.22 | 24.82 | 84.79 | 27.27 | 55.03 | 12.3 |
Model 8 | 79.58 | 63.09 | 43.92 | 26.07 | 87.86 | 27.97 | 55.88 | 12.6 |
Model 9 | 72.24 | 51.88 | 29.57 | 14.54 | 58.93 | 22.04 | 49.05 | 8.91 |
Model 10 | 81.64 | 63.79 | 43.43 | 25.15 | 85.81 | 27.25 | 55.06 | 12.35 |
Model 11 | 82.91 | 68.02 | 50.75 | 33.59 | 104.25 | 31.33 | 59.67 | 14.85 |
git-a-generative-image-to-text-transformer | 88.55 | 76.1 | 60.53 | 41.65 | 122.4 | 33.41 | 64.02 | 16.18 |
Model 13 | 76.48 | 58.76 | 39.28 | 21.96 | 69.59 | 25.08 | 53.22 | 10.94 |
Model 14 | 77.68 | 60.34 | 41.5 | 24.57 | 74.27 | 26.04 | 54.42 | 11.47 |
Model 15 | 82.9 | 68.09 | 49.73 | 31.24 | 96.63 | 29.37 | 58.62 | 13.61 |
simvlm-simple-visual-language-model | 84.64 | 70.0 | 52.96 | 34.66 | 108.98 | 31.97 | 61.01 | 14.6 |
Model 17 | 72.76 | 53.52 | 34.13 | 19.45 | 64.37 | 23.47 | 50.53 | 10.11 |
Model 18 | 78.73 | 61.63 | 42.35 | 25.94 | 80.19 | 27.25 | 55.25 | 12.38 |
Model 19 | 76.89 | 57.3 | 37.78 | 21.49 | 80.61 | 28.53 | 53.47 | 14.99 |
grit-faster-and-better-image-captioning | - | - | - | - | 105.9 | - | - | 13.6 |
Model 21 | 83.77 | 68.7 | 51.26 | 32.76 | 101.69 | 30.51 | 59.75 | 14.99 |
Model 22 | 80.7 | 63.27 | 42.86 | 25.78 | 84.83 | 27.23 | 55.91 | 12.06 |
Model 23 | 77.06 | 59.97 | 40.54 | 23.8 | 68.98 | 25.06 | 53.49 | 10.55 |
Model 24 | 74.35 | 55.97 | 36.12 | 20.84 | 70.33 | 25.1 | 52.26 | 11.07 |
Model 25 | 75.91 | 56.78 | 35.58 | 17.39 | 60.89 | 23.8 | 51.42 | 9.81 |
Model 26 | 76.49 | 56.2 | 33.73 | 15.14 | 62.96 | 23.68 | 50.84 | 10.13 |
vivo-surpassing-human-performance-in-novel | 86.33 | 72.83 | 55.94 | 37.97 | 112.82 | 32.7 | 62.48 | 15.22 |
clipcap-clip-prefix-for-image-captioning | - | - | - | - | 84.85 | - | - | 12.14 |
Model 29 | 77.68 | 60.34 | 41.5 | 24.57 | 74.27 | 26.04 | 54.42 | 11.46 |
vinvl-making-visual-representations-matter-in | 83.24 | 68.04 | 49.68 | 30.62 | 97.99 | 29.51 | 58.54 | 13.63 |
Model 31 | 76.12 | 57.98 | 38.44 | 21.92 | 67.91 | 25.07 | 52.53 | 10.87 |
Model 32 | 77.65 | 59.58 | 39.86 | 22.83 | 76.02 | 26.35 | 53.98 | 11.8 |
Model 33 | 72.05 | 52.89 | 31.92 | 16.71 | 53.34 | 22.04 | 49.64 | 9.16 |
clipcap-clip-prefix-for-image-captioning | - | - | - | - | 79.73 | - | - | 12.2 |
Model 35 | 80.5 | 64.48 | 46.46 | 29.59 | 88.08 | 28.7 | 56.84 | 13.04 |
Model 36 | 84.2 | 69.57 | 52.56 | 34.8 | 104.9 | 31.77 | 60.52 | 15.04 |
Model 37 | 80.26 | 63.94 | 44.65 | 27.23 | 87.21 | 27.7 | 56.4 | 12.28 |
Model 38 | 81.64 | 63.79 | 43.43 | 25.15 | 85.81 | 27.25 | 55.06 | 12.35 |
pali-a-jointly-scaled-multilingual-language | 88.02 | 75.21 | 59.38 | 41.16 | 121.09 | 34.22 | 64.39 | 15.69 |
Model 40 | 80.68 | 64.7 | 45.33 | 27.09 | 87.28 | 27.7 | 56.76 | 12.79 |
Model 41 | 81.86 | 67.2 | 50.5 | 34.11 | 99.9 | 31.61 | 59.54 | 15.17 |
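
Several rows in the table report only CIDEr (and sometimes SPICE), so any ranking has to decide what to do with the dashes. The following is a minimal sketch, not part of the benchmark itself, of one way to parse the pipe-delimited rows and rank models by a chosen metric while skipping entries that lack it. The names `TABLE`, `parse_table`, and `rank_by` are hypothetical, the sample rows are copied from the table above, and treating "-" as "not reported" is an assumption.

```python
from typing import Dict, List, Optional, Union

Row = Dict[str, Union[str, Optional[float]]]

# A few rows copied verbatim from the comparison table above.
TABLE = """\
Model Name | B1 | B2 | B3 | B4 | CIDEr | METEOR | ROUGE-L | SPICE |
---|---|---|---|---|---|---|---|---|
git-a-generative-image-to-text-transformer | 88.86 | 75.86 | 59.94 | 41.1 | 124.18 | 33.83 | 63.82 | 16.36 |
pali-a-jointly-scaled-multilingual-language | - | - | - | - | 149.1 | - | - | - |
grit-faster-and-better-image-captioning | - | - | - | - | 105.9 | - | - | 13.6 |
Model 9 | 72.24 | 51.88 | 29.57 | 14.54 | 58.93 | 22.04 | 49.05 | 8.91 |
"""


def parse_cell(cell: str) -> Optional[float]:
    """Convert one table cell to a float; "-" is assumed to mean 'not reported'."""
    cell = cell.strip()
    return None if cell == "-" else float(cell)


def parse_table(text: str) -> List[Row]:
    """Parse the header line, skip the separator line, and read each data row."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # The trailing "|" produces an empty final field, which is dropped here.
    header = [h.strip() for h in lines[0].split("|") if h.strip()]
    rows: List[Row] = []
    for ln in lines[2:]:
        cells = [c.strip() for c in ln.split("|")]
        if cells and cells[-1] == "":
            cells = cells[:-1]
        row: Row = {header[0]: cells[0]}
        for metric, value in zip(header[1:], cells[1:]):
            row[metric] = parse_cell(value)
        rows.append(row)
    return rows


def rank_by(rows: List[Row], metric: str) -> List[Row]:
    """Return rows that report `metric`, sorted from highest to lowest score."""
    scored = [r for r in rows if r[metric] is not None]
    return sorted(scored, key=lambda r: r[metric], reverse=True)


if __name__ == "__main__":
    for row in rank_by(parse_table(TABLE), "CIDEr"):
        print(row["Model Name"], row["CIDEr"])
```

Ranking by CIDEr keeps every sample row because all of them report it, whereas ranking by B4 silently drops the rows with dashes. Whether to drop or flag such rows is a design choice; the sketch drops them so that missing values are never compared as if they were zero.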