HyperAI
11 days ago

Efficient Two-Step Networks for Temporal Action Segmentation

Shenglan Liu, YuHan Wang, Li Xu, Jie Zhu, Lianyu Hu, Lin Feng, Kaiyuan Liu, Zhuben Dong, Yunheng Li

Abstract

Because of boundary ambiguity and over-segmentation, labeling every frame in long untrimmed videos remains challenging. To address these problems, we present the Efficient Two-Step Network (ETSN), which consists of two steps. The first step is Efficient Temporal Series Pyramid Networks (ETSPNet), which capture both local and global frame-level features and provide accurate predictions of segmentation boundaries. The second step is a novel unsupervised approach called Local Burr Suppression (LBS), which significantly reduces over-segmentation errors. Our empirical evaluations on the 50Salads, GTEA, and Breakfast benchmarks demonstrate that ETSN outperforms the current state-of-the-art methods by a large margin.
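
To make the over-segmentation problem concrete, the sketch below shows a generic short-segment smoothing heuristic applied to frame-wise labels. It is not the LBS procedure, which this abstract does not specify; the function names, the `min_len` threshold, and the toy label sequence are all assumptions chosen for illustration.

```python
from itertools import groupby


def to_segments(labels):
    """Collapse frame-wise labels into (label, start, length) runs."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, start, length))
        start += length
    return segments


def suppress_short_segments(labels, min_len=3):
    """Illustrative heuristic (an assumption, not the paper's LBS):
    relabel a run shorter than min_len when the frame just before it
    (after smoothing) and the run just after it carry the same label."""
    segments = to_segments(labels)
    smoothed = list(labels)
    for i in range(1, len(segments) - 1):
        _, start, length = segments[i]
        prev_label = smoothed[start - 1]   # already-smoothed left neighbour
        next_label = segments[i + 1][0]    # original right neighbour
        if length < min_len and prev_label == next_label:
            smoothed[start:start + length] = [prev_label] * length
    return smoothed


# Toy prediction: a single spurious "mix" frame interrupts a "cut" action.
frames = ["cut"] * 5 + ["mix"] + ["cut"] * 4 + ["mix"] * 6
cleaned = suppress_short_segments(frames, min_len=3)
```

In this toy sequence the isolated "mix" frame splits one "cut" action into two segments; after smoothing, `to_segments(cleaned)` contains two runs instead of four. A fixed length threshold is the simplest possible choice and would also erase genuinely short actions, which is one reason a learned or data-driven suppression step may be preferable.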
