
Learning Latent Sub-events in Activity Videos Using Temporal Attention Filters

Piergiovanni, AJ ; Fan, Chenyou ; Ryoo, Michael S.
Abstract

In this paper, we introduce the concept of temporal attention filters and describe how they can be used for human activity recognition from videos. Many high-level activities are composed of multiple temporal parts (e.g., sub-events) with different durations and speeds, and our objective is to make the model explicitly learn such temporal structure using multiple attention filters and benefit from it. Our temporal filters are designed to be fully differentiable, allowing end-to-end training of the temporal filters together with the underlying frame-based or segment-based convolutional neural network architectures. This paper presents an approach for learning a set of optimal static temporal attention filters to be shared across different videos, and extends this approach to dynamically adjust attention filters per testing video using recurrent long short-term memory networks (LSTMs). This allows our temporal attention filters to learn latent sub-events specific to each activity. We experimentally confirm that the proposed temporal attention filters benefit activity recognition, and we visualize the learned latent sub-events.
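
To make the idea of a temporal attention filter concrete, the following is a minimal sketch and not the authors' implementation. It assumes each filter is a normalized Gaussian weighting over frame positions, parameterized by a hypothetical `center` and `width` in normalized time; the abstract does not specify the filter shape, so that choice, along with all function names here, is an assumption made for illustration. Each filter pools per-frame features into one fixed-size vector, and several filters together give one vector per candidate sub-event.

```python
import math


def temporal_filter(num_frames, center, width):
    """Return attention weights over frames that sum to 1.

    `center` and `width` are in normalized time [0, 1]. The Gaussian shape
    is an assumption of this sketch, not something stated in the abstract.
    """
    # Place each frame at the midpoint of its slot in normalized time.
    positions = [(t + 0.5) / num_frames for t in range(num_frames)]
    raw = [math.exp(-((p - center) ** 2) / (2.0 * width ** 2)) for p in positions]
    total = sum(raw)
    return [r / total for r in raw]


def pool_features(frame_features, weights):
    """Weighted sum of per-frame feature vectors (one vector per frame)."""
    dim = len(frame_features[0])
    return [sum(w * f[d] for w, f in zip(weights, frame_features)) for d in range(dim)]


def sub_event_representation(frame_features, filters):
    """Concatenate the pooled vector produced by each (center, width) filter."""
    out = []
    for center, width in filters:
        weights = temporal_filter(len(frame_features), center, width)
        out.extend(pool_features(frame_features, weights))
    return out


# Toy "video": 10 frames, each with a 2-D feature [frame index, constant 1].
video = [[float(t), 1.0] for t in range(10)]
# Three hypothetical filters attending to the start, middle, and end.
filters = [(0.2, 0.1), (0.5, 0.1), (0.8, 0.1)]
rep = sub_event_representation(video, filters)
```

Because the weights are smooth functions of `center` and `width`, such a filter could in principle be trained by gradient descent alongside the feature extractor, which is the property the abstract refers to as fully differentiable. In this sketch the filter parameters are fixed by hand rather than learned.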
