FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment

Jinglin Xu, Sibo Yin, Guohao Zhao, Zishuo Wang, Yuxin Peng

Abstract

Existing action quality assessment (AQA) methods mainly learn deep representations at the video level for scoring diverse actions. Because they lack a fine-grained understanding of the actions in videos, they suffer severely from low credibility and interpretability and are thus insufficient for stringent applications such as Olympic diving events. We argue that a fine-grained understanding of actions requires the model to perceive and parse actions in both time and space, which is also the key to the credibility and interpretability of the AQA technique. Based on this insight, we propose a new fine-grained spatial-temporal action parser named FineParser. It learns human-centric foreground action representations by focusing on target action regions within each frame and exploiting their fine-grained alignments in time and space to minimize the impact of invalid backgrounds during the assessment. In addition, we construct fine-grained annotations of human-centric foreground action masks for the FineDiving dataset, called FineDiving-HM. With refined annotations on diverse target action procedures, FineDiving-HM can promote the development of real-world AQA systems. Through extensive experiments, we demonstrate the effectiveness of FineParser, which outperforms state-of-the-art methods while supporting more tasks of fine-grained action understanding. Data and code are available at https://github.com/PKU-ICST-MIPL/FineParser_CVPR2024.

