Long Video Retrieval (Background Removed)
Evaluation Metrics
Cap. Avg. R@1
Cap. Avg. R@5
Cap. Avg. R@10
DTW R@1
DTW R@5
DTW R@10
OTAM R@1
OTAM R@5
OTAM R@10
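For readers unfamiliar with the R@K notation used in the metrics above, the following is a minimal sketch of how Recall@K is commonly computed from a query-by-candidate similarity matrix. It is an illustration only, not this benchmark's evaluation code: the function name `recall_at_k`, the toy matrix `sim`, and the assumption that the correct candidate for query i sits at index i are all hypothetical. How the similarity scores are produced (caption averaging, DTW, or OTAM) is outside the scope of this sketch.

```python
import numpy as np

def recall_at_k(similarity: np.ndarray, k: int) -> float:
    """Percentage of queries whose ground-truth candidate is in the top k.

    Assumption (hypothetical setup): similarity[i, j] scores query i
    against candidate j, and the correct candidate for query i is i.
    """
    # Rank candidates for each query from highest to lowest similarity.
    order = np.argsort(-similarity, axis=1)
    # A query counts as a hit if its own index is among the first k ranks.
    targets = np.arange(similarity.shape[0])[:, None]
    hits = (order[:, :k] == targets).any(axis=1)
    return 100.0 * float(hits.mean())

# Toy 3x3 similarity matrix (made-up values, for illustration only).
sim = np.array([
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.7, 0.6],
])

r_at_1 = recall_at_k(sim, 1)
r_at_2 = recall_at_k(sim, 2)
```

Because every item retrieved in the top 5 is also in the top 10, R@10 can never be lower than R@5 for the same model and metric, which is a quick sanity check when reading the results.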
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model | Cap. Avg. R@1 | Cap. Avg. R@5 | Cap. Avg. R@10 | DTW R@1 | DTW R@5 | DTW R@10 | OTAM R@1 | OTAM R@5 | OTAM R@10 |
---|---|---|---|---|---|---|---|---|---|
multi-granularity-correspondence-learning-1 | 75.5 | 95.0 | 97.7 | 88.7 | 98.8 | 99.5 | 88.9 | 98.4 | 99.5 |
multimodal-clustering-networks-for-self | 53.4 | 75.0 | 81.4 | - | - | - | - | - | - |
videoclip-contrastive-pre-training-for-zero | 74.5 | 94.5 | 97.9 | 56.0 | 89.9 | 96.3 | 52.8 | 89.2 | 95.0 |
end-to-end-learning-of-visual-representations | 43.1 | 68.6 | 79.1 | - | - | - | - | - | - |
tempclr-temporal-alignment-representation | 74.5 | 94.6 | 97.0 | 83.5 | 97.2 | 99.3 | 84.9 | 97.9 | 99.5 |
howto100m-learning-a-text-video-embedding-by | 46.6 | 74.3 | 83.7 | - | - | - | - | - | - |