Image Captioning On Flickr30K Captions Test
Metrics
CIDEr
SPICE
Results
Performance results of various models on this benchmark
Comparison table
Model name | CIDEr | SPICE |
---|---|---|
language-models-are-general-purpose | 43.3 | 11.7 |
unified-vision-language-pre-training-for | 67.4 | 17 |
a-good-prompt-is-worth-millions-of-parameters | 31.0 | 10.0 |
Model 4 | 67.1 | 14.5 |
unifying-vision-and-language-tasks-via-text | 2.6 | 2.0 |
paying-more-attention-to-saliency-image | 46.4 | - |
deep-visual-semantic-alignments-for | 24.7 | - |
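
For readers who want to rank the entries themselves, the following is a minimal sketch in Python. It assumes the rows are copied by hand from the table above, that a "-" means no SPICE score was reported (stored as None), and that higher is better for both CIDEr and SPICE. The names ROWS and rank_by are illustrative and not part of any benchmark tooling.

```python
# Rows copied from the comparison table: (model name, CIDEr, SPICE).
# None stands for a "-" entry, i.e. no SPICE score reported.
ROWS = [
    ("language-models-are-general-purpose", 43.3, 11.7),
    ("unified-vision-language-pre-training-for", 67.4, 17.0),
    ("a-good-prompt-is-worth-millions-of-parameters", 31.0, 10.0),
    ("Model 4", 67.1, 14.5),
    ("unifying-vision-and-language-tasks-via-text", 2.6, 2.0),
    ("paying-more-attention-to-saliency-image", 46.4, None),
    ("deep-visual-semantic-alignments-for", 24.7, None),
]


def rank_by(rows, metric_index):
    """Return rows sorted best-first on one column, skipping missing scores."""
    scored = [row for row in rows if row[metric_index] is not None]
    return sorted(scored, key=lambda row: row[metric_index], reverse=True)


by_cider = rank_by(ROWS, 1)  # all seven entries have a CIDEr score
by_spice = rank_by(ROWS, 2)  # only the five entries with a SPICE score

for name, cider, spice in by_cider:
    spice_text = "-" if spice is None else f"{spice:.1f}"
    print(f"{name} | {cider:.1f} | {spice_text}")
```

Dropping rows without a score, rather than treating them as zero, keeps the SPICE ranking from penalizing entries that simply did not report that metric.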