Spatial Reasoning on EmbSpatial-Bench
Metrics
Generation
Results
Performance results of various models on this benchmark
Model Name | Generation | Paper Title | Repository |
---|---|---|---|
GPT-4V | 36.07 | GPT-4 Technical Report | |
Qwen-VL-Max | 49.11 | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | |
LLaVA-1.6 | 35.19 | Visual Instruction Tuning | |
MiniGPT4 | 23.54 | MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models | |
SoFar | 70.88 | SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation | |