Video Grounding on QVHighlights
Metrics
R@1,IoU=0.5
R@1,IoU=0.7
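This page does not define the two metrics. Under the usual reading in moment retrieval, R@1 at IoU=t is the percentage of queries whose top-ranked predicted span overlaps the ground-truth span with temporal IoU of at least t. The following is a minimal sketch of that reading, not the benchmark's official evaluation script. The function names `temporal_iou` and `recall_at_1` are hypothetical, and the sketch assumes exactly one ground-truth span per query.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start_sec, end_sec), assumed start <= end


def temporal_iou(pred: Span, gt: Span) -> float:
    """Intersection over union of two time intervals."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(top1_preds: List[Span], gts: List[Span], iou_threshold: float) -> float:
    """Percentage of queries whose top-1 span reaches the IoU threshold.

    Assumption: one ground-truth span per query, aligned by index.
    """
    hits = sum(
        temporal_iou(p, g) >= iou_threshold for p, g in zip(top1_preds, gts)
    )
    return 100.0 * hits / len(gts)


# Toy data (invented for illustration, not benchmark results).
preds = [(10.0, 20.0), (0.0, 6.0), (30.0, 40.0)]
gts = [(12.0, 20.0), (0.0, 10.0), (50.0, 60.0)]

print(round(recall_at_1(preds, gts, 0.5), 2))  # 66.67 (IoUs are 0.8, 0.6, 0.0)
print(round(recall_at_1(preds, gts, 0.7), 2))  # 33.33
```

Because the stricter threshold can only remove hits, the IoU=0.7 score is never higher than the IoU=0.5 score for the same model, which matches every row in the table.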
Results
Performance of various models on this benchmark
Model | R@1,IoU=0.5 | R@1,IoU=0.7 | Paper Title | Repository |
---|---|---|---|---|
DiffusionVMR | 61.61 | 44.49 | DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection | - |
UMT | 56.23 | 41.18 | UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection | |
InternVideo2-6B | 71.42 | 56.45 | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | |
InternVideo2-1B | 70.00 | 54.45 | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | |
QD-DETR | 62.40 | 44.98 | Query-Dependent Video Representation for Moment Retrieval and Highlight Detection | |
Moment-DETR | 52.89 | 33.02 | Detecting Moments and Highlights in Videos via Natural Language Queries | |