
HPDv3 Human Preference Dataset

Date: 21 days ago
Size: 16.23 GB
Organization: CUHK MMLab (The Chinese University of Hong Kong Multimedia Laboratory)
Publish URL: huggingface.co
Paper URL: arxiv.org
License: MIT

HPDv3 was proposed by the Mizzen AI research team in collaboration with the Multimedia Laboratory (MMLab) of the Chinese University of Hong Kong, King's College London, and other author teams. Released in 2025, it is the first wide-spectrum human preference dataset covering multiple domains. The accompanying paper, "HPSv3: Towards Wide-Spectrum Human Preference Score", has been accepted to ICCV 2025. The dataset is intended for aligning, reranking, and evaluating text-to-image generation models, with the goal of bringing model outputs closer to human aesthetic preferences and improving semantic consistency.

The dataset contains 1.08 million text-image pairs and 1.17 million annotated pairwise comparisons, covering both high-quality and low-quality real photos with rich annotation information. The training set has approximately 1.14 million items and the test set approximately 14,400, which makes the dataset suitable for modeling a wide range of human preferences.

The data includes:

  • Text: prompt (English)
  • Paired image paths: path1, path2 (aligned with the paths after unzipping the image package)
  • Model sources: model1, model2
  • Preference annotation: choice_dist (voting distribution, can be empty), confidence (annotation confidence, can be empty)
  • Convention: path1 always corresponds to the more preferred image
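
The fields above can be read into a small typed record. The following is a minimal sketch, assuming each annotation is available as a plain dictionary keyed by the field names listed above; the on-disk annotation format, the numeric type of confidence, the helper names (parse_record, to_training_triple, filter_confident), and the example paths are all assumptions for illustration, not part of the dataset's documented API.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class PreferencePair:
    prompt: str
    path1: str  # preferred image, per the dataset convention
    path2: str  # less preferred image
    model1: str
    model2: str
    choice_dist: Optional[list] = None  # voting distribution; may be empty
    confidence: Optional[float] = None  # may be empty (numeric type is assumed)


def parse_record(record: dict) -> PreferencePair:
    """Build a PreferencePair from one annotation record (a plain dict)."""
    return PreferencePair(
        prompt=record["prompt"],
        path1=record["path1"],
        path2=record["path2"],
        model1=record["model1"],
        model2=record["model2"],
        # Treat a missing or empty distribution uniformly as None.
        choice_dist=record.get("choice_dist") or None,
        confidence=record.get("confidence"),
    )


def to_training_triple(pair: PreferencePair, image_root: str):
    """Return (prompt, preferred_path, rejected_path) under the unzipped root."""
    root = Path(image_root)
    return pair.prompt, root / pair.path1, root / pair.path2


def filter_confident(pairs, min_confidence: float):
    """Keep pairs whose confidence is present and at least min_confidence."""
    return [
        p for p in pairs
        if p.confidence is not None and p.confidence >= min_confidence
    ]
```

Because path1 always holds the preferred image, no separate label column is needed: the order of the two paths is the label. Pairs with an empty confidence are dropped by filter_confident rather than treated as zero, which is one possible design choice among several.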

HPDv3.torrent
  • HPDv3/
    • README.md (1.89 KB)
    • README.txt (3.79 KB)
    • data/
      • HPDv3.zip (16.23 GB)