
MMPR Multimodal Reasoning Preference Dataset

Date

a year ago

Size

29.29 GB

Organization

Nanjing University
Shanghai Artificial Intelligence Laboratory
Fudan University

Publish URL

github.com

Paper URL

arxiv.org

MMPR (Multimodal Reasoning Preference Dataset) is a large-scale multimodal preference dataset released in 2024 by research teams from Shanghai Artificial Intelligence Laboratory, Fudan University, Nanjing University, the Chinese University of Hong Kong, Tsinghua University, and SenseTime. The accompanying paper is "Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization". The dataset contains 750,000 samples without a clear correct answer and 2.5 million samples with a clear correct answer. To ensure diversity, the samples cover multiple domains, including general VQA, science, charts and diagrams, mathematics, OCR, and documents. When constructing the dataset, the researchers took particular care to avoid false-positive rejected responses (responses wrongly labeled as negative because of the limitations of heuristic rules), especially in the general VQA and document domains. The dataset is designed to improve model performance on multimodal reasoning tasks while avoiding potential negative effects during training.

Example of data from MMPR. For instructions with a clear correct answer, the research team proposed a correctness-based pipeline: multiple solutions are sampled, those that reach the correct answer are treated as selected responses, and those that reach an incorrect answer are treated as rejected responses. For instructions without a clear correct answer, the research team proposed using DropoutNTP to generate the rejected responses. In the example, differences between selected and rejected responses are emphasized in italic text, and red highlights indicate incorrect responses.
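The correctness-based pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `Final answer:` marker, the exact-match comparison, and the `question`/`chosen`/`rejected` field names are all assumptions made for this example and are not taken from the MMPR release.

```python
from itertools import product


def extract_final_answer(response: str) -> str:
    """Hypothetical parser: return the text after the last 'Final answer:' marker.

    Returns an empty string when the marker is missing, so such responses
    never match a non-empty ground truth.
    """
    marker = "Final answer:"
    idx = response.rfind(marker)
    if idx == -1:
        return ""
    return response[idx + len(marker):].strip().lower()


def build_preference_pairs(question: str, responses: list[str], ground_truth: str) -> list[dict]:
    """Split sampled responses by answer correctness and pair them up.

    Responses whose extracted answer equals the ground truth become
    'chosen' (selected); all others become 'rejected'. Every chosen
    response is paired with every rejected one.
    """
    target = ground_truth.strip().lower()
    chosen = [r for r in responses if extract_final_answer(r) == target]
    rejected = [r for r in responses if extract_final_answer(r) != target]
    return [
        {"question": question, "chosen": c, "rejected": r}
        for c, r in product(chosen, rejected)
    ]


if __name__ == "__main__":
    sampled = [
        "There are 2 red and 2 blue blocks. Final answer: 4",
        "I count 5 blocks in the image. Final answer: 5",
        "Final answer: 4",
    ]
    for pair in build_preference_pairs("How many blocks are shown?", sampled, "4"):
        print(pair["chosen"], "|", pair["rejected"])
```

Exact string matching is the kind of heuristic rule that can wrongly reject acceptable free-form answers, which is consistent with the description's note that such rules are a poor fit for the general VQA and document domains.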

MMPR-OpenGVLab.torrent
Seeding: 1 | Downloading: 0 | Completed: 129 | Total Downloads: 171
  • MMPR-OpenGVLab/
    • README.md
      2.12 KB
    • README.txt
      4.25 KB
      • data/
        • MMPR.zip
          14.63 GB
          • MMPR/
            • README.md
              14.63 GB
            • annotations.zip
              16.03 GB
            • images.zip
              29.29 GB
            • meta.json
              29.29 GB
