MULTI-Benchmark: A Leaderboard for Multimodal Understanding With Text and Images
MULTI is a multimodal benchmark released by Shanghai Jiao Tong University that evaluates how well large multimodal models understand complex tables and images and reason over long texts. Mirroring the style of real-life exams, it provides multimodal inputs and asks for either precise or open-ended answers. MULTI contains more than 18,000 questions spanning tasks from formula derivation to image analysis and cross-modal reasoning.
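To make the setup concrete, the sketch below shows how a single MULTI-style question with attached images might be assembled into a prompt for a multimodal model. The field names (question_id, question_text, image_paths) and the file name are illustrative assumptions, not the benchmark's official schema.

```python
import json

def load_questions(path):
    """Load MULTI-style question records from a JSON file.

    The field names used here (question_id, question_text, image_paths)
    are illustrative assumptions, not the benchmark's official schema.
    """
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

def build_prompt(item):
    """Assemble a text prompt from one question record, noting attached images."""
    parts = [item["question_text"]]
    # Reference each image so a multimodal model can pair text with visuals;
    # precise questions expect an exact answer, open-ended ones a free response.
    for image_path in item.get("image_paths", []):
        parts.append(f"[IMAGE: {image_path}]")
    return "\n".join(parts)

if __name__ == "__main__":
    for item in load_questions("multi_questions.json"):  # hypothetical file name
        print(item["question_id"], build_prompt(item))
```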
The research team also curated MULTI-Elite, a subset of 500 particularly difficult questions, and MULTI-Extend, a collection of more than 4,500 external knowledge contexts. A knowledge context can be supplied alongside a question, as illustrated in the sketch below. MULTI thus serves as a robust evaluation platform and points the way toward expert-level AI.
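The following is a minimal sketch of how a MULTI-Extend knowledge passage might be prepended to a question prompt before querying a model. The mapping from question ID to knowledge text is a hypothetical layout, not the official MULTI-Extend format.

```python
def attach_knowledge(prompt, question_id, knowledge_by_id):
    """Prepend an external knowledge passage (MULTI-Extend style) to a prompt.

    knowledge_by_id maps a question ID to its knowledge text; this mapping
    is a hypothetical layout, not the official MULTI-Extend format.
    """
    knowledge = knowledge_by_id.get(question_id, "")
    if not knowledge:
        return prompt
    return f"Background:\n{knowledge}\n\nQuestion:\n{prompt}"

# Example: combine a knowledge snippet with a question prompt.
extend = {"q001": "The derivative of sin(x) is cos(x)."}
print(attach_knowledge("Differentiate sin(x) + x^2.", "q001", extend))
```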