Regularizing cross entropy loss via minimum entropy and K-L divergence

Ibraheem, Abdulrahman Oladipupo
Abstract

I introduce two novel loss functions for classification in deep learning. The two loss functions extend standard cross entropy loss by regularizing it with minimum entropy and Kullback-Leibler (K-L) divergence terms. The first of the two novel loss functions is termed mixed entropy loss (MIX-ENT for short), while the second is termed minimum entropy regularized cross-entropy loss (MIN-ENT for short). The MIX-ENT function introduces a regularizer that can be shown to be equivalent to the sum of a minimum entropy term and a K-L divergence term. However, the K-L divergence term here differs from that in the standard cross-entropy loss function, in the sense that it swaps the roles of the target probability and the hypothesis probability. The MIN-ENT function simply adds a minimum entropy regularizer to the standard cross entropy loss function. In both MIX-ENT and MIN-ENT, the minimum entropy regularizer minimizes the entropy of the hypothesis probability distribution output by the neural network. Experiments on the EMNIST-Letters dataset show that my implementation of MIX-ENT and MIN-ENT lets the VGG model climb from its previous 3rd position on the paperswithcode leaderboard to reach the 2nd position, outperforming the Spinal-VGG model in doing so. Specifically, using standard cross-entropy, VGG achieves 95.86% and Spinal-VGG achieves 95.88% classification accuracy, whereas using VGG (without Spinal-VGG), my MIN-ENT achieves 95.933% and my MIX-ENT achieves 95.927% accuracy. The pre-trained models for both MIX-ENT and MIN-ENT are at https://github.com/rahmanoladi/minimum entropy project.
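To make the two losses described above concrete, the following is a minimal PyTorch sketch based only on the abstract: MIN-ENT as cross entropy plus the entropy of the network's output distribution, and MIX-ENT as cross entropy plus a reverse cross-entropy term, which decomposes as the output entropy plus a K-L divergence with the roles of target and hypothesis swapped. The function names, the weight lam, and the target smoothing used to keep log(p) finite are illustrative assumptions, not values taken from the paper.

    import torch
    import torch.nn.functional as F

    def min_ent_loss(logits, targets, lam=0.1):
        # Sketch of MIN-ENT: standard cross entropy plus a minimum-entropy
        # regularizer on the hypothesis distribution q. `lam` is an assumed
        # regularization weight, not stated in the abstract.
        ce = F.cross_entropy(logits, targets)
        q = F.softmax(logits, dim=1)                          # hypothesis distribution
        entropy_q = -(q * torch.log(q + 1e-12)).sum(dim=1).mean()
        return ce + lam * entropy_q

    def mix_ent_loss(logits, targets, lam=0.1):
        # Sketch of MIX-ENT: cross entropy plus a regularizer equal to the
        # reverse cross entropy H(q, p) = H(q) + KL(q || p), i.e. the target p
        # and hypothesis q swap roles relative to standard cross entropy.
        ce = F.cross_entropy(logits, targets)
        q = F.softmax(logits, dim=1)
        num_classes = logits.size(1)
        # Lightly smoothed one-hot target so that log(p) is well defined
        # (an assumption; the paper may handle this differently).
        p = F.one_hot(targets, num_classes).float()
        p = p * (1 - 1e-3) + 1e-3 / num_classes
        reverse_ce = -(q * torch.log(p)).sum(dim=1).mean()    # = H(q) + KL(q || p)
        return ce + lam * reverse_ce

In training, either function would simply replace the standard cross-entropy call on the VGG model's logits (for EMNIST-Letters, logits of shape (batch, 26)); the weight lam would need to be tuned, since the abstract does not report the value used.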
