Multimodal Reasoning on REBUS
Evaluation Metric
Accuracy
Evaluation Results
Performance of each model on this benchmark
Model | Accuracy (%) | Paper Title | Repository |
---|---|---|---|
InstructBLIP | 0.6 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
BLIP2-FLAN-T5-XXL | 0.9 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
CogVLM | 0.9 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
LLaVa-1.5-13B | 1.8 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
Gemini Pro | 13.2 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
LLaVa-1.5-7B | 1.5 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
QWEN | 0.9 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |
GPT-4V | 24.0 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | |