MMPR-v1.2-Prompts: Multimodal Reasoning Prompts Dataset
Date:
Publish URL:
Paper URL:
License: MIT
MMPR-v1.2-Prompts is a collection of prompt corpora for multimodal reasoning preference learning, released in 2024 by the Shanghai Artificial Intelligence Laboratory in collaboration with Tsinghua University, Fudan University, and other institutions. The associated paper is "Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization"; the dataset is intended to support the training and evaluation of models on complex vision-language reasoning tasks.
Data Structure
The dataset contains approximately 3 million samples. Each sample is a natural-language text prompt, and some include multimodal constraints (for example, a requirement to reason jointly over an image and text):
- Instruction/Prompt: Expressed in natural language, covering multimodal reasoning scenarios such as visual question answering, image-text reasoning, and scene understanding.
- Input Context: For some tasks, includes an image, text, or a combination of the two to constrain the model's output.
- Output Format: The answer format specified in the prompt, such as chain-of-thought, multiple choice with reasons, or free-form explanation (see the record sketch below).
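For concreteness, a single record might look like the following sketch. The field names ("image", "question", "answer_format") are illustrative assumptions, since the card above only describes the fields informally, not the dataset's actual schema.

```python
# Hypothetical prompt record; field names are illustrative assumptions,
# not the dataset's confirmed schema.
sample = {
    "image": "images/chart_00123.png",  # optional visual context
    "question": (
        "Based on the chart, which region grew fastest from 2019 to 2021? "
        "Explain your reasoning step by step."
    ),
    "answer_format": "chain-of-thought",  # requested output style
}
```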
Note that the dataset itself does not contain model-generated answers or preference labels; it serves as the starting point for data generation, providing the input prompts from which the multimodal preference-ranking data (the MMPR dataset) is subsequently constructed.
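If the prompts are hosted on the Hugging Face Hub, they can be inspected with the `datasets` library. This is a minimal sketch; the repo id below is an assumption and should be checked against the actual publish URL.

```python
# A minimal sketch, assuming the prompts are published on the Hugging Face Hub
# under a repo id like "OpenGVLab/MMPR-v1.2-prompts" (unverified assumption).
from datasets import load_dataset

ds = load_dataset("OpenGVLab/MMPR-v1.2-prompts", split="train")
print(ds)      # dataset size and column names
print(ds[0])   # first prompt record

# In the intended pipeline, each prompt is fed to an MLLM to sample candidate
# responses, which are then ranked to build the MMPR preference data.
```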