Moment Retrieval on QVHighlights
Metrics
R@1 IoU=0.5
R@1 IoU=0.7
mAP
mAP@0.5
mAP@0.75
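As a rough illustration of how the R@1 metrics above are typically computed, the sketch below measures the temporal IoU between a predicted window and a ground-truth window, then counts a query as a hit when its top-1 prediction reaches the IoU threshold (0.5 or 0.7). The function names `temporal_iou` and `recall_at_1` are hypothetical, and matching against the best-overlapping ground-truth window when a query has several is an assumption; the benchmark's official evaluation script may handle that case differently.

```python
def temporal_iou(pred, gt):
    """Intersection over union of two (start, end) windows, in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(top1_preds, gt_windows, iou_threshold):
    """Percentage of queries whose top-1 window reaches the IoU threshold.

    top1_preds: one (start, end) window per query.
    gt_windows: one list of (start, end) windows per query.
    Assumption: a query counts as a hit if the prediction overlaps
    ANY of its ground-truth windows at or above the threshold.
    """
    hits = 0
    for pred, gts in zip(top1_preds, gt_windows):
        best = max(temporal_iou(pred, gt) for gt in gts)
        if best >= iou_threshold:
            hits += 1
    return 100.0 * hits / len(top1_preds)


# Toy example with two queries (values are illustrative, not benchmark data).
preds = [(10.0, 20.0), (0.0, 5.0)]
gts = [[(12.0, 22.0)], [(4.0, 10.0)]]
r1_at_05 = recall_at_1(preds, gts, 0.5)
r1_at_07 = recall_at_1(preds, gts, 0.7)
```

In the toy example the first prediction has IoU 8/12 (about 0.67) and the second 0.1, so only the first query counts at the 0.5 threshold and none count at 0.7.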
Results
Performance results of various models on this benchmark
Comparison table
Model name | R@1 IoU=0.5 | R@1 IoU=0.7 | mAP | mAP@0.5 | mAP@0.75 |
---|---|---|---|---|---|
saliency-guided-detr-for-moment-retrieval-and | 72.20 | 56.60 | 54.10 | 73.20 | 55.80 |
prior-knowledge-integration-via-llm-encoding | 66.73 | 49.94 | 44.05 | 65.76 | 43.91 |
boundary-denoising-for-video-activity | 59.27 | 45.07 | - | - | - |
bam-detr-boundary-aligned-moment-detection | 62.71 | 48.64 | 45.36 | 64.57 | 46.33 |
umt-unified-multi-modal-transformers-for | - | - | 36.12 | - | - |
videolights-feature-refinement-and-cross-task | 70.36 | 55.25 | 47.94 | 69.53 | 49.17 |
univtg-towards-unified-video-language | 65.43 | 50.06 | 43.63 | 64.06 | 45.02 |
bridging-the-gap-a-unified-video | 64.53 | 48.31 | 43.8 | 64.78 | 43.65 |
query-dependent-video-representation-for | 62.40 | 44.98 | 39.86 | 62.52 | 39.88 |
r-2-tuning-efficient-image-to-video-transfer-1 | 68.03 | 49.35 | 46.17 | 69.04 | 47.56 |
flashvtg-feature-layering-and-adaptive-score | 70.69 | 53.96 | 52.00 | 72.33 | 53.85 |
Model 12 | 54.5 | 36.5 | 32.3 | - | - |
query-dependent-video-representation-for | 63.06 | 45.10 | 40.19 | 63.04 | 40.10 |
bam-detr-boundary-aligned-moment-detection | 64.07 | 48.12 | 46.91 | 65.61 | 47.51 |
correlation-guided-query-dependency | 65.43 | 48.38 | 42.86 | 64.51 | 42.77 |
unloc-a-unified-framework-for-video | 66.1 | 46.7 | - | - | - |
ld-detr-loop-decoder-detection-transformer | 66.80 | 51.04 | 46.41 | 67.61 | 46.99 |
length-aware-detr-for-robust-moment-retrieval | 63.94 | 51.10 | 47.93 | 65.65 | 49.44 |
llava-mr-large-language-and-vision-assistant | 76.59 | 61.48 | 52.73 | 69.41 | 54.40 |
overcoming-weak-visual-textual-alignment-for | 60.12 | 43.05 | 40.08 | 63.08 | 40.18 |
query-dependent-video-representation-for | 64.1 | 46.1 | 40.62 | 64.3 | 40.5 |
saliency-guided-detr-for-moment-retrieval-and | 74.20 | 60.40 | 58.80 | 76.20 | 60.80 |
qvhighlights-detecting-moments-and-highlights | 59.78 | 40.33 | 36.14 | 60.51 | 35.36 |
correlation-guided-query-dependency | 68.48 | 53.11 | 47.97 | 69.40 | 49.12 |
query-dependent-video-representation-for | 63.2 | 45.2 | 40.0 | 63.4 | 40.4 |
umt-unified-multi-modal-transformers-for | - | - | 38.08 | - | - |
unloc-a-unified-framework-for-video | 64.5 | 48.8 | - | - | - |
bam-detr-boundary-aligned-moment-detection | 63.88 | 47.92 | 46.67 | 66.33 | 48.22 |
univtg-towards-unified-video-language | 58.86 | 40.86 | 35.47 | 57.60 | 35.59 |
video-mamba-suite-state-space-model-as-a | 66.65 | 52.19 | 45.18 | 64.37 | 46.68 |
bridging-the-gap-a-unified-video | 63.55 | 47.47 | 43.18 | 63.37 | 42.67 |
internvideo2-scaling-video-foundation-models | 71.42 | 56.45 | 49.24 | - | - |