
MM-RLHF Multimodal Preference Alignment Dataset

Date: 3 months ago
Size: 55.33 GB
Organization:
License: Apache 2.0

MM-RLHF (Multimodal Reinforcement Learning from Human Feedback) is a high-quality, fine-grained multimodal preference dataset introduced in the paper "MM-RLHF: The Next Step Forward in Multimodal LLM Alignment", first published on arXiv in 2025 by the Institute of Automation, Chinese Academy of Sciences (CASIA). The dataset aims to advance alignment research for multimodal large language models (MLLMs) and to address problems of truthfulness, safety, and alignment with human preferences in practical applications.

The dataset contains 120,000 fine-grained, manually annotated preference comparison pairs covering three areas: image understanding, video analysis, and multimodal safety. This volume far exceeds existing resources, spanning more than 100,000 multimodal task instances. Each example was carefully scored and explained by a pool of more than 50 annotators to ensure high quality and fine granularity.
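As a rough sketch of how such preference-comparison records could be inspected once the archive is extracted, the snippet below reads one record from a JSON Lines file. The file path (data/annotations.jsonl) and the field names (question, chosen, rejected, score, explanation) are illustrative assumptions, not the dataset's documented schema; consult the bundled README for the actual layout.

import json

# Hypothetical sketch: after extracting MM-RLHF.zip, read one
# preference-comparison record. The path and field names below are
# assumptions for illustration; the real schema is described in the
# dataset's README.
path = "MM-RLHF/data/annotations.jsonl"  # assumed location inside the archive

with open(path, "r", encoding="utf-8") as f:
    first_line = f.readline()

record = json.loads(first_line)
# A preference pair would typically hold the prompt, the candidate
# responses, and the human scores/explanations attached to them.
for key in ("question", "chosen", "rejected", "score", "explanation"):
    print(key, "->", record.get(key))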

Dataset Example

MM-RLHF.torrent
  • MM-RLHF/
    • README.md (1.55 KB)
    • README.txt (3.09 KB)
    • data/
      • MM-RLHF.zip (55.33 GB)