Visual Question Answering (VQA) on AI2D
Evaluation Metrics
EM
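As a rough illustration of how a score like the one in the EM column could be computed, the sketch below assumes EM denotes exact match, reported as the percentage of questions whose predicted answer equals the reference answer. The function names `normalize` and `exact_match`, and the lowercase-and-whitespace normalization, are hypothetical choices for this example; the benchmark's official scoring script may normalize answers differently.

```python
def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace (an assumed, illustrative normalization)."""
    return " ".join(answer.strip().lower().split())


def exact_match(predictions: list[str], references: list[str]) -> float:
    """Return the percentage of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)
```

For example, `exact_match(["A", "b", "c", "d"], ["a", "b", "x", "y"])` counts two matches out of four and returns `50.0`.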
Evaluation Results
Performance of each model on this benchmark.
Model | EM | Paper Title | Repository |
---|---|---|---|
DUBLIN | 51.11 | DUBLIN -- Document Understanding By Language-Image Network | - |
Gemini Ultra | 79.5 | Gemini: A Family of Highly Capable Multimodal Models | |
SMoLA-PaLI-X Specialist Model | 82.5 | Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | - |
SMoLA-PaLI-X Generalist Model | 81.4 | Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | - |