Temporal Action Localization on EPIC-KITCHENS
Metrics
Avg mAP (0.1-0.5)
mAP IOU@0.1
mAP IOU@0.2
mAP IOU@0.3
mAP IOU@0.4
mAP IOU@0.5
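The page lists these metrics without defining them. As a rough sketch, assuming the usual temporal-detection convention: a predicted segment counts as a match when its temporal IoU with a ground-truth segment reaches the threshold, and the average metric is the plain mean of the five per-threshold mAP values. The helper names `temporal_iou` and `average_map` are illustrative and do not come from any benchmark toolkit.

```python
from statistics import mean


def temporal_iou(pred, gt):
    """Temporal IoU of two (start, end) segments given in seconds."""
    # Overlap is zero when the segments do not intersect.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def average_map(map_per_threshold):
    """Mean of the mAP values over the listed IoU thresholds (assumed unweighted)."""
    return mean(map_per_threshold.values())


# Per-threshold values copied from the ActionFormer (verb) row of the results table.
actionformer = {0.1: 26.6, 0.2: 25.4, 0.3: 24.2, 0.4: 22.3, 0.5: 19.1}

print(round(average_map(actionformer), 1))
print(temporal_iou((2.0, 6.0), (4.0, 8.0)))
```

With these example segments the overlap is 2 s and the union is 6 s, so the prediction would count as correct at thresholds 0.1 to 0.3 but not at 0.4 or 0.5. Averages recomputed this way from the rounded table entries can differ from a published average in the last digit, since papers typically average unrounded values.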
Results
Reported performance of each model on this benchmark. All values are mAP in percent; higher is better.
Model Name | Avg mAP (0.1-0.5) | mAP IOU@0.1 | mAP IOU@0.2 | mAP IOU@0.3 | mAP IOU@0.4 | mAP IOU@0.5 | Paper Title | Repository |
---|---|---|---|---|---|---|---|---|
G-TAD (verb) | 9.4 | 12.1 | 11.0 | 9.4 | 8.1 | 6.5 | G-TAD: Sub-Graph Localization for Temporal Action Detection | |
AdaTAD (verb, VideoMAE-L) | 29.3 | 33.1 | 32.2 | 30.4 | 27.5 | 23.1 | End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames | |
TriDet (verb) | 25.4 | 28.6 | 27.4 | 26.1 | 24.2 | 20.8 | TriDet: Temporal Action Detection with Relative Boundary Modeling | |
ActionFormer (verb) | 23.5 | 26.6 | 25.4 | 24.2 | 22.3 | 19.1 | ActionFormer: Localizing Moments of Actions with Transformers | |
BMN (verb) | 8.4 | 10.8 | 9.8 | 8.4 | 7.1 | 5.6 | BMN: Boundary-Matching Network for Temporal Action Proposal Generation | |
TemporalMaxer (verb) | 24.5 | 27.8 | 26.6 | 25.3 | 23.1 | 19.9 | TemporalMaxer: Maximize Temporal Context with only Max Pooling for Temporal Action Localization | |