Medical AI Poses Disparate Privacy Risks to Patients
New research reveals that medical artificial intelligence models pose disparate privacy risks to patients, with membership inference attacks disproportionately compromising individuals from underrepresented groups. While AI offers significant diagnostic potential, these findings demonstrate that current privacy assessment methods fail to capture the granular threats faced by specific patients. The study introduces a novel methodology for evaluating membership inference attack success at the patient level, moving beyond traditional aggregate metrics.

Using state-of-the-art Likelihood Ratio attacks, researchers analyzed seven large clinical datasets spanning electronic health records, medical imaging, and electrocardiograms. The analysis exposed a critical flaw in standard reporting: aggregate attack success rates often appear benign, masking severe vulnerabilities for individual data contributors. For a distinct subset of patients, the probability of an attacker confirming their inclusion in a training dataset approaches certainty.

The risk distribution is highly unequal. Privacy audits reveal that some patients face near-perfect attack success while others remain largely unaffected. This disparity correlates strongly with demographic and clinical factors. Underrepresented groups are overrepresented among the most vulnerable records. In electronic health record models, records from Black patients, those with Medicaid insurance, and patients diagnosed with cancer showed substantially higher vulnerability than the general cohort. Similarly, in mammography datasets, rare breast density categories and specific tumor findings exhibited disproportionate risk, suggesting that data scarcity drives heightened exposure.

Model scaling significantly amplifies these inequities. As models increase in capacity to achieve higher diagnostic performance, the number of patients facing extreme privacy risks can increase by orders of magnitude. Larger architectures, including vision transformers, raised the proportion of highly vulnerable patients compared to smaller residual networks. This indicates a tangible trade-off between performance improvements and individual privacy preservation, particularly for rare conditions.

Demonstrating practical feasibility, researchers executed low-cost offline attacks against open-source chest radiograph models using minimal computational resources. The results confirm that untrusted users can exploit publicly available medical AI models to infer sensitive patient information without complex assumptions or extensive computational power.

Differential privacy emerges as a robust mitigation strategy. Applying differential privacy protections reduced attack success across all patient groups. However, the study indicates that standard record-level privacy accounting may be insufficient; effectively mitigating risks likely requires patient-level differential privacy accounting to ensure individuals contributing multiple records remain protected.

The authors recommend a shift in reporting standards from aggregate metrics to patient-level risk assessments and urge the integration of verifiable privacy protections into medical AI development workflows to prevent the exacerbation of health inequalities.
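
The gap between aggregate and patient-level reporting can be illustrated with a small sketch. It is not the study's protocol: it assumes each record has a handful of attack scores (for example, model confidences) from hypothetical shadow models trained with and without that record, fits a Gaussian to each set, and classifies a held-out score by the sign of the log-likelihood ratio. All names and numbers below are made up for illustration.

```python
import math
from statistics import mean, stdev


def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def log_likelihood_ratio(observed, in_scores, out_scores, min_sigma=1e-6):
    """Log-likelihood ratio of 'member' versus 'non-member' for one observed score."""
    sigma_in = max(stdev(in_scores), min_sigma)
    sigma_out = max(stdev(out_scores), min_sigma)
    return (gaussian_logpdf(observed, mean(in_scores), sigma_in)
            - gaussian_logpdf(observed, mean(out_scores), sigma_out))


def per_record_accuracy(in_scores, out_scores):
    """Leave-one-out attack accuracy for a single record.

    Each score is held out in turn and classified as 'member' when the
    log-likelihood ratio computed from the remaining scores is positive.
    """
    correct = 0
    for i, score in enumerate(in_scores):
        rest = in_scores[:i] + in_scores[i + 1:]
        correct += log_likelihood_ratio(score, rest, out_scores) > 0
    for i, score in enumerate(out_scores):
        rest = out_scores[:i] + out_scores[i + 1:]
        correct += log_likelihood_ratio(score, in_scores, rest) <= 0
    return correct / (len(in_scores) + len(out_scores))


# Synthetic scores (assumption): the "typical" record looks the same whether or
# not it was in training, while the "rare" record is clearly separated.
records = {
    "typical_record": ([1.0, 1.2, 0.8, 1.1, 0.9], [0.9, 1.1, 1.0, 0.8, 1.2]),
    "rare_record": ([4.0, 4.2, 3.8, 4.1, 3.9], [0.0, 0.2, -0.2, 0.1, -0.1]),
}

accuracy = {name: per_record_accuracy(i, o) for name, (i, o) in records.items()}
aggregate = mean(accuracy.values())

for name, acc in accuracy.items():
    print(f"{name}: attack accuracy {acc:.2f}")
print(f"aggregate: attack accuracy {aggregate:.2f}")
```

In this toy setup the aggregate figure sits between the two per-record values, so a report that quotes only the aggregate would understate the exposure of the well-separated record. A real audit would replace the synthetic scores with outputs from trained models and would group records by patient before reporting.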
