Object-centric Auto-encoders and Dummy Anomalies for Abnormal Event Detection in Video

Abnormal event detection in video is a challenging vision problem. Most existing approaches formulate abnormal event detection as an outlier detection task, due to the scarcity of anomalous data during training. Because of the lack of prior information regarding abnormal events, these methods are not fully-equipped to differentiate between normal and abnormal events. In this work, we formalize abnormal event detection as a one-versus-rest binary classification problem. Our contribution is two-fold. First, we introduce an unsupervised feature learning framework based on object-centric convolutional auto-encoders to encode both motion and appearance information. Second, we propose a supervised classification approach based on clustering the training samples into normality clusters. A one-versus-rest abnormal event classifier is then employed to separate each normality cluster from the rest. For the purpose of training the classifier, the other clusters act as dummy anomalies. During inference, an object is labeled as abnormal if the highest classification score assigned by the one-versus-rest classifiers is negative. Comprehensive experiments are performed on four benchmarks: Avenue, ShanghaiTech, UCSD and UMN. Our approach provides superior results on all four data sets. On the large-scale ShanghaiTech data set, our method provides an absolute gain of 8.4% in terms of frame-level AUC compared to the state-of-the-art method [Sultani et al., CVPR 2018].
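The classification stage described above (cluster the normal training samples, train one classifier per cluster against the rest, flag a test sample when its highest score is negative) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 2-D toy vectors stand in for the auto-encoder features, k-means stands in for the clustering step, a least-squares linear scorer stands in for the one-versus-rest classifier, and the names `kmeans`, `fit_one_vs_rest` and `is_abnormal` are hypothetical.

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Tiny k-means with deterministic farthest-point initialization (assumed clustering step)."""
    centers = [X[0]]
    for _ in range(1, k):
        # Squared distance of every sample to its nearest already-chosen center.
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster is empty
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def fit_one_vs_rest(X, labels, k):
    """One linear scorer per normality cluster; the other clusters act as dummy anomalies.

    Targets are +1 for the cluster's own samples and -1 for all others.
    Least squares is a stand-in for whatever binary classifier is actually used.
    """
    A = np.hstack([X, np.ones((len(X), 1))])            # append a bias column
    T = np.where(labels[:, None] == np.arange(k)[None, :], 1.0, -1.0)
    W, *_ = np.linalg.lstsq(A, T, rcond=None)           # shape (d + 1, k)
    return W

def is_abnormal(W, X):
    """A sample is abnormal if its highest one-versus-rest score is negative."""
    A = np.hstack([X, np.ones((len(X), 1))])
    scores = A @ W
    return scores.max(axis=1) < 0

# Toy "normal" features: three tight groups of four points each (hypothetical data).
offsets = np.array([[-0.5, -0.5], [-0.5, 0.5], [0.5, -0.5], [0.5, 0.5]])
modes = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X_train = np.vstack([m + offsets for m in modes])

labels = kmeans(X_train, k=3)
W = fit_one_vs_rest(X_train, labels, k=3)

X_test = np.array([[0.2, -0.1],   # close to one normality cluster
                   [3.0, 3.0]])   # lies between the clusters
print(is_abnormal(W, X_test))
```

In this toy setup, a sample near one of the clusters receives a positive score from that cluster's scorer, while the in-between sample is rejected by all three scorers and is therefore flagged. A linear scorer is unbounded in some directions, so this sketch only illustrates the decision rule and is not a faithful reproduction of the reported system.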