HyperAI超神経

Action Recognition in Videos on UCF101

Evaluation Metric

3-fold Accuracy
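
The page does not define the metric. On UCF101 it is conventionally read as classification accuracy averaged over the dataset's three standard train/test splits. The sketch below illustrates that reading only. The function names `split_accuracy` and `three_fold_accuracy` and the toy predictions are hypothetical, not taken from any listed paper, and individual entries in the table may have been evaluated differently.

```python
from statistics import mean


def split_accuracy(predictions, labels):
    """Fraction of clips in one split whose predicted class equals the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty split")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def three_fold_accuracy(splits):
    """Mean per-split accuracy over three (predictions, labels) pairs, in percent.

    Assumption: each of the three splits is weighted equally.
    """
    if len(splits) != 3:
        raise ValueError("expected exactly three splits")
    return 100.0 * mean(split_accuracy(p, y) for p, y in splits)


# Toy data (hypothetical): four clips per split, per-split accuracy 1.0, 0.75, 0.5.
toy_splits = [
    (["Archery", "Biking", "Diving", "Punch"],
     ["Archery", "Biking", "Diving", "Punch"]),
    (["Archery", "Biking", "Diving", "Punch"],
     ["Archery", "Biking", "Diving", "Surfing"]),
    (["Archery", "Biking", "Diving", "Punch"],
     ["Archery", "Biking", "Surfing", "Surfing"]),
]

print(three_fold_accuracy(toy_splits))  # 75.0
```

Averaging the per-split accuracies, rather than pooling all clips, keeps each split's contribution equal even when the test sets differ slightly in size.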

Evaluation Results

Performance results of each model on this benchmark

Comparison Table
| Model name | 3-fold Accuracy |
| --- | --- |
| large-scale-video-classification-with-1 | 65.4 |
| hallucinet-ing-spatiotemporal-representations | 79.83 |
| dmc-net-generating-discriminative-motion-cues | 96.5 |
| learning-spatio-temporal-representation-with-3 | 97 |
| i3d-lstm-a-new-model-for-human-action | 95.1 |
| vimpac-video-pre-training-via-masked-token | 92.7 |
| quo-vadis-action-recognition-a-new-model-and | 97.8 |
| real-time-action-recognition-with-enhanced | 86.4 |
| spatiotemporal-residual-networks-for-video | 94.6 |
| a-closer-look-at-spatiotemporal-convolutions | 93.3 |
| self-supervised-video-transformer | 93.7 |
| adaptive-frame-selection-in-two-dimensional | - |
| mlgcn-multi-laplacian-graph-convolutional | 63.27 |
| two-stream-video-classification-with-cross | 96.5 |
| a-closer-look-at-spatiotemporal-convolutions | 93.6 |
| quo-vadis-action-recognition-a-new-model-and | 96.7 |
| hidden-two-stream-convolutional-networks-for | 97.1 |
| omnivec2-a-novel-transformer-based-network | 99.6 |
| multi-fiber-networks-for-video-recognition | 96.0 |
| omnivec-learning-robust-representations-with | 99.6 |
| a-closer-look-at-spatiotemporal-convolutions | 95.5 |
| temporal-spatial-mapping-for-action | 94.3 |
| ligar-lightweight-general-purpose-action | 94.85 |
| vidtr-video-transformer-without-convolutions | 96.7 |
| learning-spatio-temporal-representation-with-3 | 98.2 |
| faster-recurrent-networks-for-video | 96.9 |
| dmc-net-generating-discriminative-motion-cues | 92.3 |
| quo-vadis-action-recognition-a-new-model-and | 95.1 |
| quo-vadis-action-recognition-a-new-model-and | 93.4 |
| dynamic-image-networks-for-action-recognition | 89.1 |
| r-stan-residual-spatial-temporal-attention | 94.5 |
| potion-pose-motion-representation-for-action | 29.3 |
| asymmetric-masked-distillation-for-pre | 97.1 |
| convolutional-two-stream-network-fusion-for | 92.5 |
| quo-vadis-action-recognition-a-new-model-and | 98.0 |
| learning-spatio-temporal-representation-with | 88.6 |
| learning-spatiotemporal-features-with-3d | 82.3 |
| multi-region-two-stream-r-cnn-for-action | 91.1 |
| paying-more-attention-to-motion-attention | 95.7 |
| action-recognition-with-trajectory-pooled | 91.5 |
| learning-spatio-temporal-representation-with-3 | 96.8 |
| contextual-action-cues-from-camera-sensor-for | 97.2 |
| long-term-temporal-convolutions-for-action | 91.7 |
| efficient-action-recognition-using-confidence | 91.2 |
| can-spatiotemporal-3d-cnns-retrace-the | 94.5 |
| bidirectional-cross-modal-knowledge | 98.8 |
| video-classification-with-finecoarse-networks | 97.6 |
| appearance-and-relation-networks-for-video | 94.3 |
| towards-good-practices-for-very-deep-two | 91.4 |
| omni-sourced-webly-supervised-learning-for | 98.6 |
| two-stream-convolutional-networks-for-action | 88.0 |
| distinit-learning-video-representations | 85.8 |
| d3d-distilled-3d-networks-for-video-action | 97 |
| d3d-distilled-3d-networks-for-video-action | 97.6 |
| cooperative-cross-stream-network-for | 97.4 |
| bubblenet-a-disperse-recurrent-structure-to | 97.62 |
| optical-flow-guided-feature-a-fast-and-robust | 96 |
| ts-lstm-and-temporal-inception-exploiting | 94.1 |
| a2-nets-double-attention-networks | 96.4 |
| a-closer-look-at-spatiotemporal-convolutions | 97.3 |
| mars-motion-augmented-rgb-stream-for-action | 97.8 |
| mars-motion-augmented-rgb-stream-for-action | 95.8 |
| actionflownet-learning-motion-representation | 83.9 |
| quo-vadis-action-recognition-a-new-model-and | 95.6 |
| Model 65 | 35.2 |
| end-to-end-learning-of-motion-representation | 95.4 |
| rethinking-spatiotemporal-feature-learning | 96.8 |
| a-closer-look-at-spatiotemporal-convolutions | 96.8 |
| federated-self-supervised-learning-for-video | - |
| enhancing-video-transformers-for-action | 99.7 |
| perf-net-pose-empowered-rgb-flow-net | 98.6 |
| quo-vadis-action-recognition-a-new-model-and | 96.5 |
| d3d-distilled-3d-networks-for-video-action | 97.1 |
| zeroi2v-zero-cost-adaptation-of-pre-trained | 98.6 |
| beyond-short-snippets-deep-networks-for-video | 88.6 |
| smart-frame-selection-for-action-recognition | 98.64 |
| learning-spatio-temporal-representations-with | 95.2 |
| a-closer-look-at-spatiotemporal-convolutions | 95 |
| holistic-large-scale-video-understanding | 97.8 |
| videomoco-contrastive-video-representation | 78.7 |
| video-action-recognition-collaborative | 86.1 |
| convnet-architecture-search-for | 85.8 |
| videomoco-contrastive-video-representation | 74.1 |
| an-image-is-worth-16x16-words-what-is-a-video | 97 |
| temporal-segment-networks-towards-good | 94.2 |
| r-stan-residual-spatial-temporal-attention | 91.5 |
| towards-universal-representation-for-unseen | 42.5 |
| transferring-textual-knowledge-for-visual | 98.2 |
| dance-with-flow-two-in-one-stream-action | 92 |
| videomae-v2-scaling-video-masked-autoencoders | 99.6 |