Date: a year ago

Size: 29.29 GB

Paper URL: arxiv.org

Tags: Preference Modeling, Multimodal Representation, Visual Question Answering

MMPR (Multimodal Preference Dataset) is a large-scale multimodal preference dataset released in 2024 by research teams from Shanghai Artificial Intelligence Laboratory, Fudan University, Nanjing University, the Chinese University of Hong Kong, Tsinghua University, and SenseTime. The accompanying paper is "Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization". The dataset contains 750,000 samples without clear correct answers and 2.5 million samples with clear correct answers. To ensure diversity, the samples span multiple domains, including general VQA, science, diagrams, mathematics, OCR, and documents. When constructing the dataset, the researchers took particular care to avoid responses being wrongly labeled as negative (rejected) because of the limitations of heuristic rules, especially in the general VQA and document domains. The dataset is designed to improve model performance on multimodal reasoning tasks while avoiding potential negative effects during training.

Example of data from MMPR. For instructions with a clear correct answer, the research team proposed a correctness-based pipeline: multiple solutions are sampled, those that reach the correct answer are treated as selected (preferred) responses, and those that reach an incorrect answer are treated as rejected responses. For instructions without a clear correct answer, the research team proposed using DropoutNTP to generate the rejected responses. In the example figure, the differences between selected and rejected responses are emphasized in italic text, and red highlights indicate incorrect responses.
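The correctness-based process described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `build_preference_pairs`, the record keys (`response`, `final_answer`, `question`, `chosen`, `rejected`), the exact-match comparison after normalization, and the `max_pairs` cap are all assumptions made for illustration. It only shows the general idea of splitting sampled solutions by answer correctness and pairing them up.

```python
from itertools import product
from typing import Dict, List


def normalize(answer: str) -> str:
    """Hypothetical normalization: trim whitespace and lowercase."""
    return answer.strip().lower()


def build_preference_pairs(
    question: str,
    ground_truth: str,
    sampled: List[Dict[str, str]],
    max_pairs: int = 4,
) -> List[Dict[str, str]]:
    """Split sampled solutions by correctness and pair them up.

    Each item in `sampled` is assumed to hold the full `response` text
    and the `final_answer` extracted from it (both keys are hypothetical).
    """
    gt = normalize(ground_truth)
    # Solutions whose final answer matches the ground truth become "chosen".
    chosen = [s["response"] for s in sampled if normalize(s["final_answer"]) == gt]
    # All other solutions become "rejected".
    rejected = [s["response"] for s in sampled if normalize(s["final_answer"]) != gt]
    # Every (chosen, rejected) combination is a candidate preference pair.
    pairs = [
        {"question": question, "chosen": c, "rejected": r}
        for c, r in product(chosen, rejected)
    ]
    return pairs[:max_pairs]


if __name__ == "__main__":
    samples = [
        {"response": "3 * 4 = 12, so the answer is 12.", "final_answer": "12"},
        {"response": "3 + 4 = 7, so the answer is 7.", "final_answer": "7"},
        {"response": "Four groups of three give 12.", "final_answer": " 12 "},
    ]
    for pair in build_preference_pairs("What is 3 * 4?", "12", samples):
        print(pair["chosen"], "| preferred over |", pair["rejected"])
```

If every sampled solution is correct, or every one is incorrect, this sketch yields no pairs for that instruction, since a preference pair needs one response of each kind.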

MMPR-OpenGVLab.torrent

Seeding 1, Downloading 0, Completed 140, Total Downloads 212

MMPR-OpenGVLab/
- README.md
  2.12 KB
- README.txt
  4.25 KB

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.
