Toward Training at ImageNet Scale with Differential Privacy

Differential privacy (DP) is the de facto standard for training machine learning (ML) models, including neural networks, while ensuring the privacy of individual examples in the training set. Despite a rich literature on how to train ML models with differential privacy, it remains extremely challenging to train real-life, large neural networks with both reasonable accuracy and privacy. We set out to investigate how to do this, using ImageNet image classification as a poster example of an ML task that is very challenging to resolve accurately with DP right now. This paper shares initial lessons from our effort, in the hope that it will inspire and inform other researchers to explore DP training at scale. We show approaches that help make DP training faster, as well as model types and settings of the training process that tend to work better in the DP setting. Combined, the methods we discuss let us train a ResNet-18 with DP to $47.9\%$ accuracy and privacy parameters $\epsilon = 10, \delta = 10^{-6}$. This is a significant improvement over "naive" DP training of ImageNet models, but a far cry from the $75\%$ accuracy that can be obtained by the same network without privacy. The model we use was pretrained on the Places365 data set as a starting point. We share our code at https://github.com/google-research/dp-imagenet, calling for others to build upon this new baseline to further improve DP at scale.
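For readers unfamiliar with how DP training differs from ordinary SGD, the following is a minimal, generic sketch of the DP-SGD gradient-aggregation step (per-example gradient clipping plus calibrated Gaussian noise). It is an illustration in NumPy, not the paper's actual implementation; the function name and parameters are our own, and the real code lives at the repository linked above.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """One DP-SGD style aggregation step (illustrative sketch).

    Each per-example gradient is clipped to L2 norm `clip_norm`, the clipped
    gradients are summed, Gaussian noise with standard deviation
    `noise_multiplier * clip_norm` is added, and the result is averaged
    over the batch. The clipping bounds each example's influence; the noise
    provides the (epsilon, delta) privacy guarantee after accounting.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose norm exceeds the clipping bound.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    summed = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=summed.shape)
    return (summed + noise) / len(per_example_grads)
```

The per-example clipping is the main source of overhead relative to ordinary SGD, since it prevents the usual trick of computing only the already-summed batch gradient; this is why making DP training fast at ImageNet scale is nontrivial.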