MULTI-Benchmark: A Leaderboard for Multimodal Understanding With Text and Images
MULTI is a multimodal benchmark released by Shanghai Jiao Tong University that evaluates how well large multimodal models understand complex tables and images and reason over long texts. Mirroring the style of real-life exams, it provides multimodal inputs and asks for either precise or open-ended answers. MULTI contains more than 18,000 questions spanning tasks from formula derivation to image analysis and cross-modal reasoning.
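To make the setup concrete, the sketch below shows how a single MULTI-style question with attached images might be assembled into a prompt for a multimodal model. The field names (question_id, question_text, image_paths) and the file name are illustrative assumptions, not the benchmark's official schema.

```python
import json

def load_questions(path):
    """Load MULTI-style question records from a JSON file.

    The field names used here (question_id, question_text, image_paths)
    are illustrative assumptions, not the benchmark's official schema.
    """
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

def build_prompt(item):
    """Assemble a text prompt from one question record, noting attached images."""
    parts = [item["question_text"]]
    # Reference each image so a multimodal model can pair text with visuals;
    # precise questions expect an exact answer, open-ended ones a free response.
    for image_path in item.get("image_paths", []):
        parts.append(f"[IMAGE: {image_path}]")
    return "\n".join(parts)

if __name__ == "__main__":
    for item in load_questions("multi_questions.json"):  # hypothetical file name
        print(item["question_id"], build_prompt(item))
```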
The research team also curated MULTI-Elite, a subset of 500 particularly difficult questions, and MULTI-Extend, a collection of more than 4,500 external knowledge contexts. A knowledge context can be supplied alongside a question, as illustrated in the sketch below. MULTI thus serves as a robust evaluation platform and points the way toward expert-level AI.
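The following is a minimal sketch of how a MULTI-Extend knowledge passage might be prepended to a question prompt before querying a model. The mapping from question ID to knowledge text is a hypothetical layout, not the official MULTI-Extend format.

```python
def attach_knowledge(prompt, question_id, knowledge_by_id):
    """Prepend an external knowledge passage (MULTI-Extend style) to a prompt.

    knowledge_by_id maps a question ID to its knowledge text; this mapping
    is a hypothetical layout, not the official MULTI-Extend format.
    """
    knowledge = knowledge_by_id.get(question_id, "")
    if not knowledge:
        return prompt
    return f"Background:\n{knowledge}\n\nQuestion:\n{prompt}"

# Example: combine a knowledge snippet with a question prompt.
extend = {"q001": "The derivative of sin(x) is cos(x)."}
print(attach_knowledge("Differentiate sin(x) + x^2.", "q001", extend))
```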