What's in the Flow? Exploiting Temporal Motion Cues for Unsupervised Generic Event Boundary Detection

The Generic Event Boundary Detection (GEBD) task aims to recognize generic, taxonomy-free boundaries that segment a video into meaningful events. Current methods typically involve a neural model trained on a large volume of data, demanding substantial computational power and storage space. We explore two pivotal questions pertaining to GEBD: Can non-parametric algorithms outperform unsupervised neural methods? Does motion information alone suffice for high performance? This inquiry drives us to algorithmically harness motion cues for identifying generic event boundaries in videos. In this work, we propose FlowGEBD, a non-parametric, unsupervised technique for GEBD. Our approach entails two algorithms utilizing optical flow: (i) Pixel Tracking and (ii) Flow Normalization. Through thorough experimentation on the challenging Kinetics-GEBD and TAPOS datasets, our results establish FlowGEBD as the new state-of-the-art (SOTA) among unsupervised methods. FlowGEBD exceeds neural models on the Kinetics-GEBD dataset by obtaining an F1@0.05 score of 0.713, an absolute gain of 31.7% over the unsupervised baseline, and achieves an average F1 score of 0.623 on the TAPOS validation dataset.
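
To make the flow-based idea concrete, below is a minimal, hedged sketch of how per-frame optical-flow motion statistics could be turned into candidate event boundaries. It uses OpenCV's dense Farneback flow; the mean-magnitude signal, z-score normalization, and peak threshold are illustrative assumptions and not the paper's actual Pixel Tracking or Flow Normalization algorithms.

```python
import cv2
import numpy as np


def flow_magnitude_signal(video_path, resize=(160, 90)):
    """Mean optical-flow magnitude between consecutive frames of a video."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        return np.array([])
    prev = cv2.cvtColor(cv2.resize(prev, resize), cv2.COLOR_BGR2GRAY)
    signal = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(cv2.resize(frame, resize), cv2.COLOR_BGR2GRAY)
        # Dense Farneback optical flow between consecutive grayscale frames
        flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag = np.linalg.norm(flow, axis=2)   # per-pixel motion magnitude
        signal.append(mag.mean())            # summarize motion for this frame pair
        prev = gray
    cap.release()
    return np.asarray(signal)


def detect_boundaries(signal, z_thresh=1.5):
    """Flag frames where the normalized change in motion is a local peak (illustrative)."""
    if len(signal) < 3:
        return []
    change = np.abs(np.diff(signal))
    # Simple normalization of the motion-change signal (an assumption for this sketch)
    z = (change - change.mean()) / (change.std() + 1e-8)
    return [i for i in range(1, len(z) - 1)
            if z[i] > z_thresh and z[i] >= z[i - 1] and z[i] >= z[i + 1]]
```

For example, `detect_boundaries(flow_magnitude_signal("clip.mp4"))` would return frame indices where motion changes sharply, which is the kind of cue a non-parametric, training-free GEBD method can exploit.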