Action Recognition on AVA v2.2
Evaluation Metric
mAP
Evaluation Results
Performance results of each model on this benchmark.
Comparison Table
Model | mAP |
---|---|
videomae-masked-autoencoders-are-data-1 | 36.1 |
videomae-masked-autoencoders-are-data-1 | 26.7 |
masked-feature-prediction-for-self-supervised | 39.8 |
improved-multiscale-vision-transformers-for | 34.4 |
videomae-masked-autoencoders-are-data-1 | 39.5 |
masked-video-distillation-rethinking-masked | 31.1 |
multiscale-vision-transformers | 27.3 |
internvideo-general-video-foundation-models | 41.01 |
masked-video-distillation-rethinking-masked | 38.7 |
videomae-masked-autoencoders-are-data-1 | 34.3 |
videomae-masked-autoencoders-are-data-1 | 39.3 |
masked-video-distillation-rethinking-masked | 34.2 |
slowfast-networks-for-video-recognition | 21.9 |
hiera-a-hierarchical-vision-transformer | 43.3 |
asymmetric-masked-distillation-for-pre | 33.5 |
holistic-interaction-transformer-network-for | 32.6 |
slowfast-networks-for-video-recognition | 23.8 |
end-to-end-spatio-temporal-action | 41.7 |
masked-video-distillation-rethinking-masked | 37.7 |
slowfast-networks-for-video-recognition | 27.5 |
videomae-v2-scaling-video-masked-autoencoders | 42.6 |
videomae-masked-autoencoders-are-data-1 | 37.8 |
videomae-masked-autoencoders-are-data-1 | 31.8 |
multiscale-vision-transformers | 26.1 |
actor-context-actor-relation-network-for | 31.72 |
towards-long-form-video-understanding-1 | 31.0 |
unmasked-teacher-towards-training-efficient | 39.8 |
multiscale-vision-transformers | 28.7 |
on-the-benefits-of-3d-pose-and-tracking-for | 45.1 |
videomae-masked-autoencoders-are-data-1 | 36.5 |
slowfast-networks-for-video-recognition | 27.1 |
multiscale-vision-transformers | 26.8 |
memvit-memory-augmented-multiscale-vision | 35.4 |
multiscale-vision-transformers | 24.5 |
masked-video-distillation-rethinking-masked | 41.1 |
masked-video-distillation-rethinking-masked | 40.1 |
object-region-video-transformers-1 | 26.6 |
multiscale-vision-transformers | 27.5 |