Timeception for Complex Action Recognition

Hussein, Noureldien ; Gavves, Efstratios ; Smeulders, Arnold W. M.
Abstract

This paper focuses on the temporal aspect of recognizing human activities in videos, an important visual cue that has long been undervalued. We revisit the conventional definition of activity and restrict it to Complex Action: a set of one-actions with a weak temporal pattern that serves a specific purpose. Related works use spatiotemporal 3D convolutions with fixed kernel size, which are too rigid to capture the variety in temporal extents of complex actions and too short for long-range temporal modeling. In contrast, we use multi-scale temporal convolutions, and we reduce the complexity of 3D convolutions. The outcome is Timeception convolution layers, which reason about minute-long temporal patterns, a factor of 8 longer than the best related works. As a result, Timeception achieves impressive accuracy in recognizing the human activities of Charades, Breakfast Actions, and MultiTHUMOS. Further, we demonstrate that Timeception learns long-range temporal dependencies and tolerates temporal extents of complex actions.
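
To make the idea of multi-scale temporal convolutions concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the kernel sizes (3, 5, 7), and the uniform placeholder weights are all assumptions made for illustration, and the abstract does not specify how the actual Timeception layer is built. The sketch only shows the general principle that kernels of different temporal lengths, applied in parallel to the same signal, respond over different temporal extents.

```python
from typing import List, Sequence


def temporal_conv1d(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide an odd-length kernel over a 1D temporal signal.

    Uses zero padding so the output has the same length as the input.
    As in most deep learning libraries, this is a cross-correlation
    (the kernel is not flipped).
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd for symmetric padding")
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(padded[t + k] * kernel[k] for k in range(len(kernel)))
        for t in range(len(signal))
    ]


def multi_scale_temporal_conv(
    signal: Sequence[float],
    kernel_sizes: Sequence[int] = (3, 5, 7),  # hypothetical scales, not from the paper
) -> List[List[float]]:
    """Apply one temporal kernel per scale; return one output channel per scale."""
    outputs = []
    for size in kernel_sizes:
        # Placeholder uniform weights; a trained layer would learn these.
        kernel = [1.0 / size] * size
        outputs.append(temporal_conv1d(signal, kernel))
    return outputs


if __name__ == "__main__":
    # A single impulse in the middle of a 9-step sequence.
    impulse = [0.0] * 4 + [1.0] + [0.0] * 4
    for size, channel in zip((3, 5, 7), multi_scale_temporal_conv(impulse)):
        print(size, [round(v, 3) for v in channel])
```

In this toy example, each wider kernel spreads the impulse over more time steps, which is the intuition behind letting a layer cover several temporal extents at once instead of committing to one fixed kernel size.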
