Date

3 months ago

Size

25.58 GB

Organization

Paper URL

2507.07984

License

Non-Commercial

Tags

Benchmarks

OST-Bench, released in 2025 by the Shanghai Artificial Intelligence Laboratory in collaboration with Shanghai Jiao Tong University, the University of Hong Kong, and other institutions, is a benchmark dataset for evaluating the online spatio-temporal scene understanding capabilities of multimodal large language models (MLLMs). The accompanying paper is titled "OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding." The benchmark aims to assess how well these models handle online scene exploration, visible information modeling, and spatio-temporal reasoning.

This dataset comprises approximately 1,400 real-world indoor 3D scenes and about 10,000 multi-turn temporal question-and-answer samples generated from the scene exploration process. The scenes are sourced from ScanNet, ARKitScenes, and Matterport3D and are processed with unified 3D object and semantic annotations. A continuous viewpoint exploration trajectory is constructed within each scene, and the corresponding questions and answers are generated from the information that has become visible so far. The tasks cover three core directions: agent state, visible information, and agent-object spatial relationships. These are refined into 15 sub-tasks and presented in a multi-turn dialogue format, so the model must reason online over both its historical observations and its current field of view.
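The multi-turn, online setup described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Turn` and `Sample` classes, their field names, the category labels, and the exact-match scoring are hypothetical and do not reflect the dataset's actual file schema or the paper's official metric. The point is only to show that, at each turn, the model sees the frames accumulated so far and never any future frames.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Labels for the three core directions named in the description (hypothetical names).
CATEGORIES = ("agent_state", "visible_info", "agent_object_spatial")


@dataclass
class Turn:
    """One dialogue turn: frames first seen in this turn plus one question (assumed schema)."""
    new_frames: List[str]
    question: str
    answer: str
    category: str  # one of CATEGORIES


@dataclass
class Sample:
    """A multi-turn sample tied to one exploration trajectory in one scene (assumed schema)."""
    scene_id: str
    turns: List[Turn] = field(default_factory=list)


def evaluate_online(sample: Sample,
                    model: Callable[[List[str], str], str]) -> Dict[str, float]:
    """Query the model turn by turn and return exact-match accuracy per category."""
    history: List[str] = []  # frames observed up to and including the current turn
    correct = {c: 0 for c in CATEGORIES}
    total = {c: 0 for c in CATEGORIES}
    for turn in sample.turns:
        history.extend(turn.new_frames)
        # Pass a copy so the model cannot mutate the running history.
        prediction = model(list(history), turn.question)
        total[turn.category] += 1
        if prediction.strip().lower() == turn.answer.strip().lower():
            correct[turn.category] += 1
    # Report only the categories that actually occurred in this sample.
    return {c: correct[c] / total[c] for c in CATEGORIES if total[c]}
```

Here `model` stands in for any callable that takes the accumulated frame list and a question string and returns an answer string. A real evaluation would load the samples from the released files and use the scoring rules defined by the benchmark authors.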

OST-Bench.torrent

Seeding: 1 | Downloading: 0 | Completed: 2 | Total Downloads: 60

OST-Bench/
- README.md
  1.87 KB
- README.txt
  3.74 KB

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.