
Improved Soccer Action Spotting using both Audio and Video Streams

Vanderplaetse, Bastien ; Dupont, Stéphane
Abstract

In this paper, we present a study on multi-modal (audio and video) action spotting and classification in soccer videos. Action spotting and classification are the tasks of finding the temporal anchors of events in a video and determining which events they are. This is an important application of general activity understanding. Here, we propose an experimental study on combining audio and video information at different stages of deep neural network architectures. We used the SoccerNet benchmark dataset, which contains annotated events for 500 soccer game videos from the Big Five European leagues. Through this work, we evaluated several ways to integrate the audio stream into video-only architectures. We observed an average absolute improvement in the mean Average Precision (mAP) metric of $7.43\%$ for the action classification task and of $4.19\%$ for the action spotting task.
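
To make the idea of combining modalities "at different stages" concrete, here is a minimal sketch of two common fusion points: concatenating features before a shared classifier (early fusion) and averaging per-class scores from separate models (late fusion). This is an illustration only, not the authors' architecture; the function names, the toy feature vectors, and the `audio_weight` parameter are all assumptions introduced for the example.

```python
from typing import List


def early_fusion(video_feat: List[float], audio_feat: List[float]) -> List[float]:
    """Early fusion (hypothetical helper): concatenate the per-chunk video and
    audio feature vectors so a single downstream model sees both modalities."""
    return video_feat + audio_feat


def late_fusion(video_scores: List[float], audio_scores: List[float],
                audio_weight: float = 0.5) -> List[float]:
    """Late fusion (hypothetical helper): take a weighted average of the
    per-class scores produced by two separate single-modality models."""
    if len(video_scores) != len(audio_scores):
        raise ValueError("score vectors must cover the same classes")
    w = audio_weight
    return [(1 - w) * v + w * a for v, a in zip(video_scores, audio_scores)]


# Toy values, invented for illustration (not taken from SoccerNet).
video_feat = [0.2, 0.7, 0.1]
audio_feat = [0.9, 0.4]
fused_feat = early_fusion(video_feat, audio_feat)

video_scores = [0.6, 0.3, 0.1]   # per-class scores from a video-only model
audio_scores = [0.2, 0.7, 0.1]   # per-class scores from an audio-only model
fused_scores = late_fusion(video_scores, audio_scores)
```

In this sketch, early fusion lets one model learn cross-modal interactions, while late fusion keeps the two models independent and only merges their outputs; which point works best is exactly the kind of question an experimental study like this one compares.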
