Action Classification on Kinetics-700
Evaluation Metrics
Top-1 Accuracy
Top-5 Accuracy
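As a rough illustration of how these two metrics are usually computed, the sketch below counts a clip as correct when its true class is among the k highest-scoring predictions, then reports the result as a percentage. The function name `top_k_accuracy` and the toy scores are illustrative assumptions, not the benchmark's official evaluation code.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   k: int = 1) -> float:
    """Percentage of samples whose true label is among the k top-scoring classes.

    Hypothetical helper for illustration only.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy data (made up): 3 clips, 4 classes.
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: ranked first
    [0.5, 0.1, 0.3, 0.1],  # true class 2: ranked second
    [0.4, 0.3, 0.2, 0.1],  # true class 3: ranked last
]
labels = [1, 2, 3]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"Top-1: {top1:.2f}  Top-2: {top2:.2f}")
```

On Kinetics-700 the same idea would apply with 700 classes and k set to 1 or 5. Published numbers may also depend on details such as how many clips and crops are averaged per video, which this sketch does not model.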
Evaluation Results
Performance of each model on this benchmark.
Comparison Table
Model | Top-1 Accuracy (%) | Top-5 Accuracy (%) |
---|---|---|
learn-to-cycle-time-consistent-feature | 49.43 | 73.23 |
improved-multiscale-vision-transformers-for | 76.6 | 93.2 |
learn-to-cycle-time-consistent-feature | 53.52 | 74.17 |
movinets-mobile-video-networks-for-efficient | 63.5 | - |
movinets-mobile-video-networks-for-efficient | 66.7 | - |
movinets-mobile-video-networks-for-efficient | 68.0 | - |
vidtr-video-transformer-without-convolutions | 69.5 | 88.3 |
internvideo-general-video-foundation-models | 84.0 | - |
eva-exploring-the-limits-of-masked-visual | 82.9 | - |
movinets-mobile-video-networks-for-efficient | 70.7 | - |
uniformerv2-spatiotemporal-learning-by-arming | 82.7 | 96.2 |
masked-feature-prediction-for-self-supervised | 80.4 | 95.7 |
internvideo2-scaling-video-foundation-models | 85.4 | - |
learn-to-cycle-time-consistent-feature | 56.46 | 76.82 |
vidtr-video-transformer-without-convolutions | 70.2 | 89.0 |
learn-to-cycle-time-consistent-feature | 49.15 | 72.68 |
mplug-2-a-modularized-multi-modal-foundation | 80.4 | 94.9 |
unmasked-teacher-towards-training-efficient | 83.6 | 96.7 |
co-training-transformer-with-videos-and | 79.8 | 94.9 |
improved-multiscale-vision-transformers-for | 79.4 | 94.9 |
movinets-mobile-video-networks-for-efficient | 71.7 | - |
movinets-mobile-video-networks-for-efficient | 72.3 | - |
learn-to-cycle-time-consistent-feature | 54.17 | 74.62 |
co-training-transformer-with-videos-and | 78.5 | 94.2 |
coca-contrastive-captioners-are-image-text | 81.1 | - |
hiera-a-hierarchical-vision-transformer | 81.1 | - |
rethinking-video-vits-sparse-video-tubes-for | 83.8 | 96.6 |
vision-models-are-more-robust-and-fair-when | 51.9 | - |
coca-contrastive-captioners-are-image-text | 82.7 | - |
aim-adapting-image-models-for-efficient-video | 80.4 | - |
internvideo2-scaling-video-foundation-models | 85.9 | - |
vidtr-video-transformer-without-convolutions | 67.3 | 87.7 |
improved-multiscale-vision-transformers-for | 79.4 | - |
movinets-mobile-video-networks-for-efficient | 58.5 | - |
vidtr-video-transformer-without-convolutions | 70.8 | 89.4 |
multiview-transformers-for-video-recognition | 83.4 | 96.2 |