
InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges

Chen, Guo ; Xing, Sen ; Chen, Zhe ; Wang, Yi ; Li, Kunchang ; Li, Yizhuo ; Liu, Yi ; Wang, Jiahao ; Zheng, Yin-Dong ; Huang, Bingkun ; Zhao, Zhiyu ; Pan, Junting ; Huang, Yifei ; Wang, Zun ; Yu, Jiashuo ; He, Yinan ; Zhang, Hongjie ; Lu, Tong ; Wang, Yali ; Wang, Limin ; Qiao, Yu
Abstract

In this report, we present our champion solutions to five tracks at the Ego4D challenge. We leverage our developed InternVideo, a video foundation model, for five Ego4D tasks, including Moment Queries, Natural Language Queries, Future Hand Prediction, State Change Object Detection, and Short-term Object Interaction Anticipation. InternVideo-Ego4D is an effective paradigm to adapt the strong foundation model to the downstream ego-centric video understanding tasks with simple head designs. In these five tasks, the performance of InternVideo-Ego4D comprehensively surpasses the baseline methods and the champions of CVPR 2022, demonstrating the powerful representation ability of InternVideo as a video foundation model. Our code will be released at https://github.com/OpenGVLab/ego4d-eccv2022-solutions
