Long-Term Feature Banks for Detailed Video Understanding
Wu, Chao-Yuan ; Feichtenhofer, Christoph ; Fan, Haoqi ; He, Kaiming ; Krähenbühl, Philipp ; Girshick, Ross

Abstract
To understand the world, we humans constantly need to relate the present to the past, and put events in context. In this paper, we enable existing video models to do the same. We propose a long-term feature bank---supportive information extracted over the entire span of a video---to augment state-of-the-art video models that otherwise would only view short clips of 2-5 seconds. Our experiments demonstrate that augmenting 3D convolutional networks with a long-term feature bank yields state-of-the-art results on three challenging video datasets: AVA, EPIC-Kitchens, and Charades.