PocketNet: Extreme Lightweight Face Recognition Network using Neural Architecture Search and Multi-Step Knowledge Distillation

Deep neural networks have rapidly become the mainstream method for face recognition (FR). However, the extremely large number of parameters in such models limits their deployment on embedded and low-end devices. In this work, we present an extremely lightweight and accurate FR solution, namely PocketNet. We utilize neural architecture search (NAS) to develop a new family of lightweight face-specific architectures. We additionally propose a novel training paradigm based on knowledge distillation (KD), the multi-step KD, where knowledge is distilled from the teacher model to the student model at different stages of the teacher's training maturity. We conduct a detailed ablation study proving both the sanity of using NAS for the specific task of FR rather than general object classification, and the benefits of our proposed multi-step KD. We present an extensive experimental evaluation and comparisons with state-of-the-art (SOTA) compact FR models on nine different benchmarks, including large-scale evaluation benchmarks such as IJB-B, IJB-C, and MegaFace. PocketNets have consistently advanced the SOTA FR performance on nine mainstream benchmarks when considering the same level of model compactness. With 0.92M parameters, our smallest network, PocketNetS-128, achieved results very competitive with recent SOTA compact models that contain up to 4M parameters.
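
To make the multi-step KD idea concrete, the following is a minimal PyTorch sketch of distilling from a sequence of teacher snapshots of increasing training maturity. The network definition, the embedding-level MSE distillation loss, the snapshot schedule, and all hyperparameters here are illustrative assumptions for exposition, not the exact PocketNet training recipe.

```python
# Hedged sketch of multi-step knowledge distillation: the student is trained
# against a sequence of teacher checkpoints taken at different training stages,
# rather than only against the fully trained teacher.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEmbedder(nn.Module):
    """Stand-in embedding network (hypothetical; PocketNet uses a NAS-found architecture)."""
    def __init__(self, dim_in=512, dim_emb=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim_in, 256), nn.ReLU(), nn.Linear(256, dim_emb))

    def forward(self, x):
        return F.normalize(self.net(x), dim=1)  # unit-norm face embeddings

def multi_step_kd(student, teacher_states, loader, epochs_per_step=2, alpha=1.0):
    """One distillation step per teacher snapshot, ordered by training maturity."""
    opt = torch.optim.SGD(student.parameters(), lr=0.1, momentum=0.9)
    teacher = TinyEmbedder()
    for state in teacher_states:            # advance to the next-mature teacher
        teacher.load_state_dict(state)
        teacher.eval()
        for _ in range(epochs_per_step):
            for x, _ in loader:
                with torch.no_grad():
                    t_emb = teacher(x)      # target embeddings from this stage
                s_emb = student(x)
                loss = alpha * F.mse_loss(s_emb, t_emb)  # embedding-level KD term
                # (an identity classification loss, e.g. a margin-based softmax,
                #  would normally be added to the KD term here)
                opt.zero_grad()
                loss.backward()
                opt.step()

# Usage with synthetic data: three teacher snapshots standing in for
# checkpoints saved at increasing stages of the teacher's training.
teacher = TinyEmbedder()
snapshots = [{k: v.clone() for k, v in teacher.state_dict().items()} for _ in range(3)]
data = torch.utils.data.TensorDataset(torch.randn(64, 512), torch.zeros(64, dtype=torch.long))
loader = torch.utils.data.DataLoader(data, batch_size=16)
multi_step_kd(TinyEmbedder(), snapshots, loader)
```

In a real setting the snapshots would be checkpoints saved while training the teacher, so the student first matches an easier, less mature target before being pushed toward the final teacher's embedding space.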