HPDv3 Human Preference Dataset
License: MIT
HPDv3 was released in 2025 by the Mizzen AI research team in collaboration with the Multimedia Laboratory (MMLab) of the Chinese University of Hong Kong, King's College London, and other author teams. It is presented as the first wide-spectrum human preference dataset spanning multiple domains, and the accompanying paper, "HPSv3: Towards Wide-Spectrum Human Preference Score", was accepted to ICCV 2025. The dataset targets the alignment, reranking, and evaluation of text-to-image generation models, with the goal of bringing model outputs closer to human aesthetic judgment and improving semantic consistency with the prompt.
The dataset contains 1.08 million text-image pairs and 1.17 million annotated pairwise comparisons, covering both high-quality and low-quality real photos, with rich annotation information. The training set has approximately 1.14 million items and the test set approximately 14,400, making the dataset suitable for modeling a wide range of human preferences.
The data includes:
- Text: prompt (English)
- Paired image paths: path1, path2 (aligned with the paths after unzipping the image package)
- Model sources: model1, model2
- Preference annotation: choice_dist (distribution of annotator votes, may be empty), confidence (annotation confidence, may be empty)
- Convention: path1 always corresponds to the preferred image
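
The field list above can be read as a simple record schema. The following is a minimal sketch, not the official loader: it assumes each annotation is available as a JSON object keyed by the field names listed above, that `choice_dist` is a list of numbers when present, and that `confidence` is a number when present. The class `PreferencePair`, the helpers `parse_record` and `resolve_paths`, and the sample values are hypothetical names introduced only for illustration.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Sequence, Tuple


@dataclass
class PreferencePair:
    """One pairwise comparison (hypothetical container, not part of the dataset)."""
    prompt: str
    chosen_path: str      # path1: the preferred image, per the dataset convention
    rejected_path: str    # path2: the less preferred image
    chosen_model: str     # model1: source of the preferred image
    rejected_model: str   # model2: source of the other image
    choice_dist: Optional[Sequence[float]] = None  # assumed list of votes; may be absent
    confidence: Optional[float] = None             # assumed numeric; may be absent


def parse_record(record: dict) -> PreferencePair:
    """Map a raw record onto the named fields, tolerating empty annotations."""
    return PreferencePair(
        prompt=record["prompt"],
        chosen_path=record["path1"],
        rejected_path=record["path2"],
        chosen_model=record["model1"],
        rejected_model=record["model2"],
        # Treat a missing key, null, or an empty list as "no vote distribution".
        choice_dist=record.get("choice_dist") or None,
        confidence=record.get("confidence"),
    )


def resolve_paths(pair: PreferencePair, image_root: str) -> Tuple[Path, Path]:
    """Join the relative paths onto the directory where the image package was unzipped."""
    root = Path(image_root)
    return root / pair.chosen_path, root / pair.rejected_path


# Made-up sample record; the values are illustrative, not taken from the dataset.
sample = json.loads(
    '{"prompt": "a red bicycle", "path1": "images/a.jpg", "path2": "images/b.jpg",'
    ' "model1": "model_a", "model2": "model_b", "choice_dist": null, "confidence": null}'
)
pair = parse_record(sample)
chosen, rejected = resolve_paths(pair, "hpdv3_images")
print(chosen.as_posix(), rejected.as_posix())
```

Because path1 is always the preferred image, no separate label column is needed: a reward-model training loop can treat `chosen_path` as the positive example and `rejected_path` as the negative one, and use `confidence` (when present) to filter or weight pairs.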
