Harvard-GF3300 Retinal Neurological Disease (Glaucoma) Dataset

This is a glaucoma (retinal neurological disease) dataset of 3,300 subjects. It contains 2D and 3D image data and is balanced in sample size across racial groups for glaucoma detection. Glaucoma is the leading cause of irreversible blindness worldwide, and its prevalence among Black individuals is twice that of other racial groups.
The Harvard-GF dataset aims to promote fairness in AI-automated glaucoma diagnosis, focusing on the retinal nerve fiber layer (RNFL). It addresses several of the main challenges currently facing fair learning: the limited number and quality of public datasets, the particular lack of imaging datasets suitable for building fair computer vision models, and the lack of fairness datasets in the medical and health fields.
The main features of the Harvard-GF dataset include:
- It is the first fairness dataset dedicated to deep learning research in medical imaging.
- The dataset contains an equal number of subjects from the three major racial groups (White, Black, and Asian), which avoids data imbalance that could confound fair-learning results.
- Both 2D and 3D imaging data are available, which opens up the underexplored area of 3D fair learning.
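
The balance property and the fairness goal described above can be illustrated with a minimal Python sketch. It is not part of the dataset's tooling. The label strings, the function names, and the synthetic list of race labels are all hypothetical, and the figure of 1,100 subjects per group is an assumption that follows from splitting 3,300 subjects equally across three groups. Per-group accuracy is used here only as one simple way to compare a model's performance across groups.

```python
from collections import Counter


def is_balanced(groups):
    """Return True when every group has the same number of subjects."""
    counts = Counter(groups)
    return len(set(counts.values())) <= 1


def per_group_accuracy(y_true, y_pred, groups):
    """Compute classification accuracy separately for each group."""
    correct = Counter()
    total = Counter()
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Hypothetical metadata: one race label per subject.
# Assumes 3,300 subjects split equally across three groups.
races = ["white"] * 1100 + ["black"] * 1100 + ["asian"] * 1100
counts = Counter(races)
balanced = is_balanced(races)

# Toy labels and predictions (1 = glaucoma, 0 = non-glaucoma), for illustration only.
toy_accuracy = per_group_accuracy(
    y_true=[1, 0, 1, 0],
    y_pred=[1, 0, 0, 0],
    groups=["white", "white", "black", "black"],
)
```

A large gap between the values returned by `per_group_accuracy` would suggest that a model performs unevenly across groups, which is the kind of disparity a balanced dataset makes easier to measure.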