Privileged Knowledge Distillation for Online Action Detection

Online Action Detection (OAD) in videos is formulated as a per-frame labeling task for real-time prediction settings in which only the previous and current video frames are observable. This paper presents a novel learning-with-privileged-information framework for online action detection, where future frames, observable only at training time, are treated as a form of privileged information. Knowledge distillation is employed to transfer this privileged information from an offline teacher to an online student. We note that this setting differs from conventional knowledge distillation because the difference between the teacher and student models lies mostly in the input data rather than the network architecture. We propose Privileged Knowledge Distillation (PKD), which (i) schedules a curriculum learning procedure and (ii) inserts auxiliary nodes into the student model, both to shrink the information gap and improve learning performance. Compared with other OAD methods that explicitly predict future frames, our approach avoids modeling unpredictable and inconsistent future visual content and achieves state-of-the-art accuracy on two popular OAD benchmarks, TVSeries and THUMOS14.
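The teacher-student setup described above can be sketched minimally as follows. This is an illustrative assumption of the distillation objective, not the paper's exact architecture: the offline teacher scores a frame using the full video (including future frames), the online student uses only the causal prefix, and the student is trained to match the teacher's per-frame action distribution via a KL-divergence loss. The linear scorers, feature dimensions, and temperature below are all hypothetical placeholders.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax, numerically stabilized.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def frame_logits(frames, W):
    # Toy encoder: average-pool observed frames, then score actions linearly.
    return frames.mean(axis=0) @ W

def distillation_loss(video, W_teacher, W_student, t, T=2.0):
    """KL(teacher || student) at frame index t of a (num_frames, dim) video.

    The teacher observes all frames (offline); the student observes only
    frames 0..t (online, causal). Minimizing this loss transfers the
    teacher's privileged (future-aware) knowledge to the student.
    """
    teacher_p = softmax(frame_logits(video, W_teacher), T)            # full video
    student_p = softmax(frame_logits(video[: t + 1], W_student), T)   # causal prefix
    return float(np.sum(teacher_p * (np.log(teacher_p) - np.log(student_p))))

rng = np.random.default_rng(0)
video = rng.normal(size=(16, 8))   # 16 frames, 8-dim features (illustrative)
W_t = rng.normal(size=(8, 4))      # teacher scorer over 4 action classes
print(distillation_loss(video, W_t, W_t.copy(), t=5) >= 0.0)
```

A curriculum as in PKD could then gradually shrink the teacher's visible future horizon toward the student's causal view, reducing the information gap over training; that scheduling is omitted here for brevity.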