Multimodal Reasoning on REBUS
Evaluation Metrics
Accuracy
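As a rough illustration of the metric, the sketch below computes accuracy as the share of puzzles answered correctly, expressed as a percentage. The function name `accuracy_percent` and the normalization step (ignoring case and extra whitespace) are assumptions made for this example; the benchmark's own scoring rules may differ.

```python
def accuracy_percent(predictions, references):
    """Return the percentage of predictions that match their reference answer."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")

    def normalize(text):
        # Assumption: matching ignores case and surrounding or repeated whitespace.
        return " ".join(text.lower().split())

    correct = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * correct / len(references)
```

For example, one correct answer out of four puzzles would give `accuracy_percent(["a", "b", "c", "d"], ["a", "x", "y", "z"])`, which evaluates to 25.0.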
Evaluation Results
Performance of each model on this benchmark
Model | Accuracy (%) | Paper Title | Repository |
---|---|---|---|
InstructBLIP | 0.6 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
BLIP2-FLAN-T5-XXL | 0.9 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
CogVLM | 0.9 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
LLaVa-1.5-13B | 1.8 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
Gemini Pro | 13.2 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
LLaVa-1.5-7B | 1.5 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
QWEN | 0.9 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
GPT-4V | 24.0 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |