8 months ago

Video Understanding

Action Recognition

Multi-Task Learning

Method/Architecture

Computer Vision

Author Name

Abstract

The recently proposed action spotting task consists in finding the exact timestamp in which an event occurs. This task fits particularly well for soccer videos, where events correspond to salient actions strictly defined by soccer rules (a goal occurs when the ball crosses the goal line). In this paper, we devise a lightweight and modular network for action spotting, which can simultaneously predict the event label and its temporal offset using the same underlying features. We enrich our model with two training strategies: the first one for data balancing and uniform sampling, the second for masking ambiguous frames and keeping the most discriminative visual cues. When tested on the SoccerNet dataset and using standard features, our full proposal exceeds the current state of the art by 3 Average-mAP points. Additionally, it reaches a gain of more than 10 Average-mAP points on the test set when fine-tuned in combination with a strong 2D backbone.
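
The idea of predicting an event label and its temporal offset from the same underlying features can be illustrated with a small sketch. This is not the authors' network: the feature size, the number of classes, the randomly initialised linear heads, the sigmoid offset encoding, and the names `make_head` and `spot` are all assumptions chosen for illustration. It only shows how one shared feature vector for a video chunk could feed a classification head and an offset head, and how the two outputs combine into a spotted timestamp.

```python
import math
import random


def linear(x, w, b):
    """Dense layer: w holds one weight row per output unit."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def make_head(rng, n_in, n_out):
    """Hypothetical head with small random weights (stands in for training)."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return w, b


def spot(features, cls_head, off_head, chunk_start, chunk_len):
    """Return (label index, confidence, timestamp in seconds) for one chunk.

    Both heads read the same feature vector, which is the point of the sketch.
    """
    probs = softmax(linear(features, *cls_head))
    label = max(range(len(probs)), key=probs.__getitem__)
    # Assumption: the offset head predicts a relative position in (0, 1)
    # inside the chunk, which is then mapped back to absolute time.
    rel = sigmoid(linear(features, *off_head)[0])
    return label, probs[label], chunk_start + rel * chunk_len


rng = random.Random(0)
FEAT_DIM, NUM_CLASSES = 8, 4  # hypothetical sizes, not taken from the paper
cls_head = make_head(rng, FEAT_DIM, NUM_CLASSES)
off_head = make_head(rng, FEAT_DIM, 1)
features = [rng.uniform(-1.0, 1.0) for _ in range(FEAT_DIM)]

label, conf, t = spot(features, cls_head, off_head, chunk_start=120.0, chunk_len=20.0)
print(label, round(conf, 3), round(t, 2))
```

Sharing the features means the expensive part of the model runs once per chunk, while the two heads stay small. In a trained model the weights would come from jointly optimising a classification loss and an offset regression loss rather than from random initialisation.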

