Action Recognition On Charades Ego

Metric: mAP (mean average precision)

Results

Performance results of various models on this benchmark

Model Name	mAP (%)	Paper Title
HierVL	33.8	HierVL: Learning Hierarchical Video-Language Embeddings
EgoVLP	32.1	Egocentric Video-Language Pretraining
LaViLa (Zero-shot, TimeSformer-L)	28.9	Learning Video Representations from Large Language Models
LaViLa (Finetuned, TimeSformer-L)	36.1	Learning Video Representations from Large Language Models
EgoVLPv2	34.1	EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
HierVL (Zero-shot)	26	HierVL: Learning Hierarchical Video-Language Embeddings
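
For readers unfamiliar with the metric, the following is a minimal sketch of how a mean average precision score of the kind reported in the table can be computed for multi-label action recognition. It is an illustration only, not the benchmark's official evaluation script: the function names, the row-per-video and column-per-class layout, and the choice to skip classes with no positive examples are all assumptions made here.

```python
def average_precision(scores, labels):
    """AP for one class: mean of the precision at each rank holding a positive."""
    # Rank videos by descending score for this class.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)
    # Assumption: a class with no positives has undefined AP and is skipped.
    return sum(precisions) / len(precisions) if precisions else None


def mean_average_precision(score_matrix, label_matrix):
    """Mean of per-class AP, as a percentage.

    score_matrix and label_matrix are lists of rows (one per video),
    each row holding one entry per action class (hypothetical layout).
    """
    num_classes = len(score_matrix[0])
    aps = []
    for c in range(num_classes):
        ap = average_precision(
            [row[c] for row in score_matrix],
            [row[c] for row in label_matrix],
        )
        if ap is not None:
            aps.append(ap)
    if not aps:
        raise ValueError("no class has a positive example")
    return 100.0 * sum(aps) / len(aps)


# Toy example: 3 videos, 2 action classes (made-up numbers).
toy_scores = [[0.9, 0.1], [0.8, 0.7], [0.2, 0.6]]
toy_labels = [[1, 0], [0, 1], [1, 0]]
toy_map = mean_average_precision(toy_scores, toy_labels)
```

Averaging over classes rather than over videos gives every action class the same weight regardless of how often it occurs, which is why the per-class loop comes first.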
