Natural Language Moment Retrieval On
Evaluation Metrics
R@1,IoU=0.5
R@1,IoU=0.7
R@5,IoU=0.5
R@5,IoU=0.7
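As a rough illustration of how metrics of the form R@n,IoU=m are commonly computed for moment retrieval: a query counts as a hit if any of its top-n predicted segments overlaps the ground-truth segment with a temporal IoU of at least m, and the score is the percentage of queries that are hits. The sketch below rests on assumptions not stated on this page: the function names `temporal_iou` and `recall_at_n` are hypothetical, segments are `(start, end)` pairs in seconds, the threshold comparison is inclusive (`>=`), and scores are reported on a 0-100 scale. Individual papers may differ in these details.

```python
from typing import Sequence, Tuple

# Assumption: a segment is a (start, end) pair in seconds.
Segment = Tuple[float, float]


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Intersection over union of two 1-D time intervals."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_n(
    ranked_preds: Sequence[Sequence[Segment]],
    gts: Sequence[Segment],
    n: int,
    iou_threshold: float,
) -> float:
    """Percentage of queries with at least one top-n prediction whose
    IoU with the ground truth reaches the threshold (inclusive; an assumption)."""
    hits = sum(
        any(temporal_iou(p, gt) >= iou_threshold for p in preds[:n])
        for preds, gt in zip(ranked_preds, gts)
    )
    return 100.0 * hits / len(gts)


# Toy data (made up for illustration): two queries, predictions sorted by confidence.
ranked_preds = [
    [(0.0, 10.0), (5.0, 15.0)],   # query 1
    [(1.0, 5.0), (20.0, 30.0)],   # query 2
]
gts = [(4.0, 14.0), (0.0, 5.0)]

r1_05 = recall_at_n(ranked_preds, gts, n=1, iou_threshold=0.5)  # 50.0
r5_07 = recall_at_n(ranked_preds, gts, n=5, iou_threshold=0.7)  # 100.0
print(r1_05, r5_07)
```

In the toy data, the top-1 prediction for query 1 has an IoU of 6/14 and misses both thresholds, while its second prediction reaches 9/11, so R@1 is lower than R@5.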
Evaluation Results
Performance of each model on this benchmark
Comparison Table
Model | R@1,IoU=0.5 | R@1,IoU=0.7 | R@5,IoU=0.5 | R@5,IoU=0.7 |
---|---|---|---|---|
dense-regression-network-for-video-grounding | 45.45 | 24.36 | 77.97 | 50.30 |
learning-grounded-vision-language | 49.18 | 29.69 | - | - |
unloc-a-unified-framework-for-video | 48.0 | 29.7 | 81.5 | 61.4 |
unloc-a-unified-framework-for-video | 48.3 | 30.2 | 79.2 | 61.3 |
learning-grounded-vision-language | 60.67 | 38.55 | - | - |
llava-mr-large-language-and-vision-assistant | 55.16 | 35.68 | - | - |
vlg-net-video-language-graph-matching-network | 46.32 | 29.82 | 77.15 | 63.33 |
unimd-towards-unifying-moment-retrieval-and | - | - | 80.54 | 57.04 |