Image Captioning on nocaps (Out-of-Domain)
Metrics
CIDEr
SPICE
Results
Performance results of various models on this benchmark
Comparison table
Model name | CIDEr | SPICE |
---|---|---|
clipcap-clip-prefix-for-image-captioning | 49.14 | 9.57 |
Model 2 | 21.3 | 7.2 |
clipcap-clip-prefix-for-image-captioning | 49.35 | 9.7 |
Model 4 | 72.13 | 11.53 |
Model 5 | 30.09 | 8.08 |
Model 6 | 26.55 | 7.72 |
Model 7 | 58.48 | 8.77 |
Model 8 | 30.09 | 8.08 |
vivo-surpassing-human-performance-in-novel | 110.14 | 13.74 |
Model 10 | 71.43 | 10.57 |
Model 11 | 48.73 | 8.2 |
Model 12 | 88.54 | 13.87 |
Model 13 | 70.21 | 10.15 |
Model 14 | 103.75 | 13.75 |
Model 15 | 85.18 | 11.18 |
Model 16 | 68.92 | 10.05 |
Model 17 | 87.51 | 12.52 |
Model 18 | 77.39 | 11.59 |
Model 19 | 23.07 | 7.4 |
Model 20 | 68.5 | 10.01 |
Model 21 | 54.56 | 9.9 |
git-a-generative-image-to-text-transformer | 122.27 | 15.62 |
vinvl-making-visual-representations-matter-in | 78.01 | 11.48 |
simvlm-simple-visual-language-model | 109.49 | 13.89 |
Model 25 | 26.25 | 7.52 |
Model 26 | 91.62 | 14.21 |
Model 27 | 87.15 | 11.43 |
Model 28 | 121.69 | 15.13 |
Model 29 | 36.12 | 9.39 |
git-a-generative-image-to-text-transformer | 122.04 | 15.7 |
Model 31 | 39.39 | 7.62 |
Model 32 | 75.39 | 10.68 |
Model 33 | 66.67 | 9.74 |
Model 34 | 43.2 | 9.35 |
Model 35 | 78.91 | 12.14 |
Model 36 | 25.91 | 7.61 |
Model 37 | 73.75 | 9.72 |
pali-a-jointly-scaled-multilingual-language | 126.67 | 15.49 |
Model 39 | 106.55 | 14.21 |
grit-faster-and-better-image-captioning | 72.6 | 11.1 |
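
The table is not sorted, so the leading entry depends on which metric is used. As a minimal sketch (not part of the benchmark itself), the snippet below ranks four rows copied from the table by either metric. The helper name `rank_by` and the tuple layout are assumptions made for illustration.

```python
# Four rows copied from the table above: (model name, CIDEr, SPICE).
rows = [
    ("vivo-surpassing-human-performance-in-novel", 110.14, 13.74),
    ("git-a-generative-image-to-text-transformer", 122.27, 15.62),
    ("simvlm-simple-visual-language-model", 109.49, 13.89),
    ("pali-a-jointly-scaled-multilingual-language", 126.67, 15.49),
]


def rank_by(rows, metric_index):
    """Return rows sorted from best to worst on the given column.

    metric_index is 1 for CIDEr and 2 for SPICE; higher is better for both.
    This helper is hypothetical and exists only for this example.
    """
    return sorted(rows, key=lambda row: row[metric_index], reverse=True)


by_cider = rank_by(rows, 1)
by_spice = rank_by(rows, 2)

print("Best CIDEr:", by_cider[0][0])
print("Best SPICE:", by_spice[0][0])
```

On this subset the two metrics disagree about the top entry, which is why both columns are worth reading rather than CIDEr alone.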