Image Captioning on nocaps-XD Near-Domain
Evaluation Metrics
- B1–B4: BLEU-1 through BLEU-4
- CIDEr
- METEOR
- ROUGE-L
- SPICE
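To make the B1–B4 columns concrete, the sketch below computes a simplified sentence-level BLEU score (uniform n-gram weights, single reference, standard brevity penalty, no smoothing). The function name and structure are illustrative assumptions; the leaderboard scores are corpus-level and use multiple references per image.

```python
import math
from collections import Counter

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Illustrative sentence-level BLEU-1..max_n sketch (not the exact
    leaderboard implementation): geometric mean of modified n-gram
    precisions, times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        # Clipped (modified) precision: each candidate n-gram counts at most
        # as often as it appears in the reference.
        overlap = sum((cand_ngrams & ref_ngrams).values())
        precisions.append(overlap / max(sum(cand_ngrams.values()), 1))
    if min(precisions) == 0:
        return 0.0  # no smoothing in this sketch
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An exact match scores 1.0, a caption sharing no words scores 0.0, and partial overlaps fall in between; B1 corresponds to `max_n=1`, B4 to `max_n=4`.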
Evaluation Results
Performance of each model on this benchmark:
Model | B1 | B2 | B3 | B4 | CIDEr | METEOR | ROUGE-L | SPICE | Paper Title | Repository |
---|---|---|---|---|---|---|---|---|---|---|
Neural Baby Talk | 73.69 | 54.1 | 32.37 | 15.99 | 53.21 | 21.93 | 49.63 | 9.26 | - | - |
Microsoft Cognitive Services team | 82.88 | 67.01 | 48.73 | 30.21 | 101.2 | 30.0 | 58.76 | 14.27 | VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning | - |
GIT2 | 88.9 | 75.86 | 58.9 | 38.95 | 125.51 | 32.95 | 63.66 | 16.11 | GIT: A Generative Image-to-text Transformer for Vision and Language | |
UpDown | 75.25 | 56.93 | 36.91 | 20.49 | 56.85 | 23.6 | 51.84 | 10.33 | - | - |
Neural Baby Talk + CBS | 74.77 | 53.67 | 30.66 | 13.85 | 61.98 | 22.55 | 49.45 | 9.83 | - | - |
GIT | 88.56 | 75.48 | 58.46 | 38.44 | 123.92 | 32.86 | 63.5 | 15.96 | GIT: A Generative Image-to-text Transformer for Vision and Language | |
test_cbs2 | 79.88 | 61.31 | 40.26 | 21.84 | 85.81 | 27.0 | 53.98 | 13.01 | - | - |
UpDown + ELMo + CBS | 77.68 | 58.31 | 37.04 | 19.85 | 74.2 | 24.97 | 52.64 | 11.45 | - | - |
VLAF2 | 84.45 | 69.28 | 51.1 | 31.48 | 104.76 | 30.31 | 59.75 | 14.97 | - | - |
icp2ssi1_coco_si_0.02_5_test | 79.51 | 62.65 | 43.22 | 24.97 | 85.73 | 26.37 | 55.13 | 11.96 | - | - |
Human | 77.05 | 56.97 | 36.84 | 19.85 | 84.58 | 28.42 | 53.06 | 14.72 | - | - |
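CIDEr is the primary ranking metric on nocaps leaderboards. As a quick illustration, the top CIDEr scores copied from the table above can be ranked programmatically (only a subset of rows is shown; the dictionary name is an assumption for the sketch):

```python
# CIDEr scores copied from the table above (subset of rows).
cider_scores = {
    "GIT2": 125.51,
    "GIT": 123.92,
    "VLAF2": 104.76,
    "Microsoft Cognitive Services team": 101.2,
    "Human": 84.58,
}

# Sort models by CIDEr, highest first.
ranked = sorted(cider_scores.items(), key=lambda kv: kv[1], reverse=True)
print(ranked[0][0])  # GIT2
```

Note that several models (GIT, GIT2, VIVO) exceed the Human row on CIDEr, though Human captions still score competitively on SPICE.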