HyperAI

Action Recognition on AVA v2.2

Metrics

mAP
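
As a rough illustration of the metric, the sketch below computes mean average precision (mAP) as the mean over action classes of the per-class average precision, reported in percent. It assumes each detection has already been marked as a true or false positive (for example by matching predicted boxes to ground truth), and it uses plain non-interpolated AP, so values may differ from those of the official evaluation script. The function names and input layout are hypothetical, chosen only for this example.

```python
from typing import Dict, List, Tuple

# (confidence score, is_true_positive) for one detection of a class
Detection = Tuple[float, bool]


def average_precision(scored: List[Detection], num_positives: int) -> float:
    """Non-interpolated AP for one class.

    Detections are ranked by confidence; precision is sampled at every
    true-positive rank and averaged over the number of ground-truth instances.
    """
    if num_positives == 0:
        return 0.0
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / num_positives


def mean_average_precision(
    per_class: Dict[str, Tuple[List[Detection], int]]
) -> float:
    """Mean of per-class AP, scaled to percent as on the leaderboard."""
    aps = [average_precision(dets, n_pos) for dets, n_pos in per_class.values()]
    return 100.0 * sum(aps) / len(aps)


# Toy input (made-up values): two classes with pre-matched detections.
toy = {
    "stand": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "sit": ([(0.6, True)], 1),
}
toy_map = mean_average_precision(toy)
```

Averaging per class before averaging across classes keeps frequent actions from dominating the score, which is why a single mAP number is used to compare the models listed here.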

Results

Performance results of various models on this benchmark

Comparison table
Model name                                      mAP
videomae-masked-autoencoders-are-data-1         36.1
videomae-masked-autoencoders-are-data-1         26.7
masked-feature-prediction-for-self-supervised   39.8
improved-multiscale-vision-transformers-for     34.4
videomae-masked-autoencoders-are-data-1         39.5
masked-video-distillation-rethinking-masked     31.1
multiscale-vision-transformers                  27.3
internvideo-general-video-foundation-models     41.01
masked-video-distillation-rethinking-masked     38.7
videomae-masked-autoencoders-are-data-1         34.3
videomae-masked-autoencoders-are-data-1         39.3
masked-video-distillation-rethinking-masked     34.2
slowfast-networks-for-video-recognition         21.9
hiera-a-hierarchical-vision-transformer         43.3
asymmetric-masked-distillation-for-pre          33.5
holistic-interaction-transformer-network-for    32.6
slowfast-networks-for-video-recognition         23.8
end-to-end-spatio-temporal-action               41.7
masked-video-distillation-rethinking-masked     37.7
slowfast-networks-for-video-recognition         27.5
videomae-v2-scaling-video-masked-autoencoders   42.6
videomae-masked-autoencoders-are-data-1         37.8
videomae-masked-autoencoders-are-data-1         31.8
multiscale-vision-transformers                  26.1
actor-context-actor-relation-network-for        31.72
towards-long-form-video-understanding-1         31.0
unmasked-teacher-towards-training-efficient     39.8
multiscale-vision-transformers                  28.7
on-the-benefits-of-3d-pose-and-tracking-for     45.1
videomae-masked-autoencoders-are-data-1         36.5
slowfast-networks-for-video-recognition         27.1
multiscale-vision-transformers                  26.8
memvit-memory-augmented-multiscale-vision       35.4
multiscale-vision-transformers                  24.5
masked-video-distillation-rethinking-masked     41.1
masked-video-distillation-rethinking-masked     40.1
object-region-video-transformers-1              26.6
multiscale-vision-transformers                  27.5