Visual Question Answering on VizWiz 2020 VQA
Evaluation Metrics
number
other
overall
unanswerable
yes/no
Evaluation Results
Performance of each model on this benchmark
Model Name | number | other | overall | unanswerable | yes/no | Paper Title | Repository |
---|---|---|---|---|---|---|---|
sudoku | 26.83 | 42.29 | 55.93 | 88.95 | 73.45 | - | - |
Katya | 27.37 | 40.92 | 54.76 | 86.82 | 80.52 | - | - |
e50 | 18.16 | 28.88 | 44.9 | 84.13 | 60.08 | - | - |
Video-LaVIT | - | - | 56.0 | - | - | Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization | |
VWTest1 | 14.09 | 17.57 | 34.13 | 78.2 | 25.31 | - | - |
knight777 | 17.34 | 27.34 | 44.01 | 85.86 | 53.01 | - | - |
PaLI | - | - | 73.3 | - | - | PaLI: A Jointly-Scaled Multilingual Language-Image Model | |
pk | 18.7 | 26.13 | 41.92 | 81.54 | 49.86 | - | - |
BERT-RG | 2.71 | 1.21 | 6.25 | 7.13 | 79.85 | - | - |
CLIP-Ensemble | - | - | 61.64 | - | - | Less Is More: Linear Layers on CLIP Features as Powerful VizWiz Model | - |
CLIP-Single | - | - | 60.66 | - | - | Less Is More: Linear Layers on CLIP Features as Powerful VizWiz Model | - |
HSSLab | 27.1 | 42.3 | 56.33 | 89.49 | 78.89 | - | - |
shaunakh | 22.22 | 34.21 | 48.39 | 83.43 | 60.65 | - | - |
Modified Attention | 20.6 | 34.14 | 49.58 | 88.26 | 59.79 | - | - |
Tartans | 23.04 | 19.05 | 34.96 | 71.45 | 60.08 | - | - |
SKP | 18.97 | 28.12 | 44.62 | 84.32 | 63.8 | - | - |