OmniSpatial Panoramic Spatial Reasoning Benchmark Dataset
License: Apache 2.0
OmniSpatial is a panoramic spatial reasoning benchmark dataset released in 2025 by Tsinghua University, the Shanghai Institute of Intelligence, the Shanghai Artificial Intelligence Laboratory, and other institutions. The accompanying paper is "OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models", which aims to fill the gap in evaluating the spatial understanding of vision-language models.
The dataset contains approximately 1,533 image-question-answer examples covering four major categories of spatial reasoning tasks: dynamic reasoning, complex spatial logic, spatial interaction, and perspective taking, divided into 50 subtasks in total. The data comes from diverse sources, including internet images, psychology tests, and driving-test questions, and the annotations have undergone multiple rounds of review to ensure quality and diversity. Unlike traditional benchmarks, OmniSpatial avoids template-based construction, making it more realistic and complex. It tests not only basic spatial relationships (such as front-back, left-right, and near-far) but also multi-object interactions, scene changes, and cross-viewpoint reasoning.
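As a minimal sketch of how such a benchmark item might be represented and scored (the field names and schema below are assumptions for illustration, not the dataset's documented format), each example can be modeled as an image-question-answer record and evaluated by multiple-choice accuracy:

```python
from dataclasses import dataclass

@dataclass
class SpatialExample:
    # Hypothetical schema: field names are assumptions, not OmniSpatial's actual format.
    image_path: str   # path to the panoramic/internet image
    question: str     # natural-language spatial question
    choices: list     # answer options, e.g. ["A", "B", "C", "D"]
    answer: str       # gold option label
    category: str     # one of the four task categories
    subtask: str      # one of the 50 subtasks

def accuracy(examples, predictions):
    """Fraction of examples whose predicted label matches the gold answer."""
    correct = sum(ex.answer == pred for ex, pred in zip(examples, predictions))
    return correct / len(examples)
```

A model under evaluation would emit one predicted option label per example; per-category accuracy can be computed by filtering on the `category` field before calling `accuracy`.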
The dataset is suitable for training and evaluating the spatial reasoning capabilities of large multimodal models, particularly for applications such as intelligent navigation, augmented/virtual reality, and complex scene understanding. It serves as a comprehensive and challenging standardized benchmark.