Action Recognition in Videos on HMDB-51
Metrics
Average accuracy of 3 splits
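HMDB-51 has three standard train/test splits, and this metric is the mean of the classification accuracies obtained on them. A minimal sketch of that arithmetic follows. The helper name `average_split_accuracy` and the per-split values are made up for illustration and do not correspond to any model listed here.

```python
from statistics import mean


def average_split_accuracy(split_accuracies):
    """Return the mean accuracy (in percent) over the three evaluation splits."""
    if len(split_accuracies) != 3:
        raise ValueError("expected exactly three split accuracies")
    return mean(split_accuracies)


# Hypothetical per-split accuracies in percent, for illustration only.
example_splits = [70.0, 72.0, 74.0]
print(average_split_accuracy(example_splits))
```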
Results
Performance results of various models on this benchmark
Comparison Table
Model Name | Average accuracy of 3 splits (%) |
---|---|
action-recognition-with-trajectory-pooled | 65.9 |
end-to-end-learning-of-motion-representation | 72.6 |
late-temporal-modeling-in-3d-cnn | 85.10 |
contextual-action-cues-from-camera-sensor-for | 80.92 |
learning-discriminative-video-representations | 74.3 |
actionflownet-learning-motion-representation | 56.4 |
appearance-and-relation-networks-for-video | 70.9 |
vimpac-video-pre-training-via-masked-token | 65.9 |
two-stream-convolutional-networks-for-action | 59.4 |
rethinking-spatiotemporal-feature-learning | 75.9 |
a-closer-look-at-spatiotemporal-convolutions | 76.4 |
videomoco-contrastive-video-representation | 49.2 |
high-order-tensor-pooling-with-attention-for | 85.70 |
quo-vadis-action-recognition-a-new-model-and | 77.3 |
representation-flow-for-action-recognition | 81.1 |
faster-recurrent-networks-for-video | 75.7 |
a-closer-look-at-spatiotemporal-convolutions | 66.6 |
dynamic-image-networks-for-action-recognition | 65.2 |
convnet-architecture-search-for | 54.9 |
paying-more-attention-to-motion-attention | 72.0 |
videomoco-contrastive-video-representation | 43.6 |
quo-vadis-action-recognition-a-new-model-and | 80.9 |
d3d-distilled-3d-networks-for-video-action | 80.5 |
omni-sourced-webly-supervised-learning-for | 83.8 |
quo-vadis-action-recognition-a-new-model-and | 80.7 |
learning-spatio-temporal-representation-with-3 | 78.9 |
d3d-distilled-3d-networks-for-video-action | 78.7 |
temporal-segment-networks-towards-good | 69.4 |
dmc-net-generating-discriminative-motion-cues | 62.8 |
hidden-two-stream-convolutional-networks-for | 78.7 |
quo-vadis-action-recognition-a-new-model-and | 74.3 |
distinit-learning-video-representations | 54.8 |
quo-vadis-action-recognition-a-new-model-and | 77.1 |
pose-and-joint-aware-action-recognition | 54.2 |
learning-spatiotemporal-features-with-3d | 51.6 |
convolutional-two-stream-network-fusion-for | 65.4 |
pose-and-joint-aware-action-recognition | 84.53 |
spatiotemporal-residual-networks-for-video | 70.3 |
r-stan-residual-spatial-temporal-attention | 55.16 |
self-supervised-video-transformer | 67.2 |
r-stan-residual-spatial-temporal-attention | 62.8 |
holistic-large-scale-video-understanding | 76.5 |
optical-flow-guided-feature-a-fast-and-robust | 74.2 |
learning-spatio-temporal-representations-with | 71.5 |
a-closer-look-at-spatiotemporal-convolutions | 78.7 |
dmc-net-generating-discriminative-motion-cues | 71.8 |
smart-frame-selection-for-action-recognition | 84.36 |
tensor-representations-for-action-recognition | 86.11 |
d3d-distilled-3d-networks-for-video-action | 79.3 |
a-closer-look-at-spatiotemporal-convolutions | 74.5 |
learning-spatio-temporal-representation-with-3 | 75.7 |
hierarchical-feature-aggregation-networks-for | 71.13 |
spatiotemporal-multiplier-networks-for-video | 72.2 |
bubblenet-a-disperse-recurrent-structure-to | 82.60 |
motionsqueeze-neural-motion-feature-learning | 77.4 |
a-closer-look-at-spatiotemporal-convolutions | 72.7 |
dmc-net-generating-discriminative-motion-cues | 77.8 |
hallucinating-statistical-moment-and-subspace | 87.56 |
cooperative-cross-stream-network-for | 81.9 |
hallucinating-bag-of-words-and-fisher-vector | 82.48 |
video-classification-with-finecoarse-networks | 77.6 |
ts-lstm-and-temporal-inception-exploiting | 69 |
videomae-v2-scaling-video-masked-autoencoders | 88.1 |
vidtr-video-transformer-without-convolutions | 74.4 |
susinet-see-understand-and-summarize-it | 62.7 |
learning-spatio-temporal-representation-with-3 | 80.5 |
a-closer-look-at-spatiotemporal-convolutions | 70.1 |
mars-motion-augmented-rgb-stream-for-action | 80.9 |
quo-vadis-action-recognition-a-new-model-and | 74.8 |
perf-net-pose-empowered-rgb-flow-net | 83.2 |
long-term-temporal-convolutions-for-action | 64.8 |
bidirectional-cross-modal-knowledge | 83.1 |
towards-universal-representation-for-unseen | 51.8 |
asymmetric-masked-distillation-for-pre | 79.6 |
zeroi2v-zero-cost-adaptation-of-pre-trained | 83.4 |
high-order-tensor-pooling-with-attention-for | 87.21 |
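The rows above are not sorted by score. As a rough sketch, rows in this `name | score |` layout could be parsed and ranked as shown below. The helper `parse_rows` is hypothetical, and the three sample rows are copied from the table for illustration.

```python
def parse_rows(text):
    """Parse 'name | score |' lines into (name, score) tuples."""
    rows = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) < 2 or not cells[0]:
            continue
        try:
            score = float(cells[1])
        except ValueError:
            continue  # skip the header and separator lines
        rows.append((cells[0], score))
    return rows


# Three rows taken from the table, plus its header and separator.
sample = """
Model Name | Average accuracy of 3 splits |
---|---|
two-stream-convolutional-networks-for-action | 59.4 |
videomae-v2-scaling-video-masked-autoencoders | 88.1 |
temporal-segment-networks-towards-good | 69.4 |
"""

# Sort from highest to lowest average accuracy.
ranked = sorted(parse_rows(sample), key=lambda row: row[1], reverse=True)
for name, score in ranked:
    print(f"{score:6.2f}  {name}")
```

Some names appear several times with different scores, presumably for different variants reported under the same entry, so the sketch keeps every row rather than deduplicating by name.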