Efficient Two-Step Networks for Temporal Action Segmentation
Shenglan Liu, YuHan Wang, Li Xu, Jie Zhu, Lianyu Hu, Lin Feng, Kaiyuan Liu, Zhuben Dong, Yunheng Li
Abstract
Due to boundary ambiguity and over-segmentation, correctly identifying every frame in long untrimmed videos remains challenging. To address these problems, we present the Efficient Two-Step Network (ETSN), which consists of two steps. The first step is Efficient Temporal Series Pyramid Networks (ETSPNet), which capture both local and global frame-level features and provide accurate predictions of segmentation boundaries. The second step is a novel unsupervised approach called Local Burr Suppression (LBS), which significantly reduces over-segmentation errors. Our empirical evaluations on the 50Salads, GTEA, and Breakfast benchmarks demonstrate that ETSN outperforms the current state-of-the-art methods by a large margin.
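
To make the over-segmentation problem concrete, the sketch below collapses frame-wise labels into segments and then applies a naive short-run filter. This is not the paper's LBS, whose details the abstract does not give; the function names (to_segments, suppress_short_segments), the min_len threshold, and the toy labels are all hypothetical and chosen only for illustration.

```python
from itertools import groupby


def to_segments(labels):
    """Collapse frame-wise labels into (label, start, length) runs."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, start, length))
        start += length
    return segments


def suppress_short_segments(labels, min_len):
    """Naive filter (an assumption, not LBS): relabel runs shorter than min_len.

    A short run takes the label of the frames just before it; a short run at
    the very start borrows the label of the run that follows it.
    """
    segments = to_segments(labels)
    smoothed = []
    for i, (label, _start, length) in enumerate(segments):
        if length < min_len:
            if smoothed:
                label = smoothed[-1]
            elif i + 1 < len(segments):
                label = segments[i + 1][0]
        smoothed.extend([label] * length)
    return smoothed


# Toy prediction: a single spurious "peel" frame splits one "cut" action in two.
frames = ["cut"] * 5 + ["peel"] * 1 + ["cut"] * 4 + ["mix"] * 6
cleaned = suppress_short_segments(frames, min_len=3)
print(len(to_segments(frames)), "segments before,", len(to_segments(cleaned)), "after")
```

In this toy input the one-frame "peel" run fragments a single action into three segments, which is the kind of error that segment-level metrics penalize. A fixed length threshold like this can also erase genuinely short actions, so it should be read only as an illustration of the problem.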