Supervised Video Summarization on TVSum
Evaluation Metrics
F1-score (Augmented)
F1-score (Canonical)
Kendall's Tau
Spearman's Rho
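As a rough illustration of the two rank-correlation metrics listed above, the sketch below compares a model's predicted frame-importance scores with human-annotated scores. It is a minimal pure-Python version under stated assumptions: the function names (`average_ranks`, `spearman_rho`, `kendall_tau`) and the sample scores are hypothetical, Kendall's Tau is computed in its simple tau-a form, and the exact protocol used by this benchmark (for example, how results are averaged over annotators and videos) is not specified on this page.

```python
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical per-frame importance scores, for illustration only.
predicted = [0.1, 0.4, 0.35, 0.8, 0.6]
human = [0.2, 0.3, 0.5, 0.9, 0.4]

print(round(kendall_tau(predicted, human), 3))
print(round(spearman_rho(predicted, human), 3))
```

Both coefficients range from -1 to 1 and depend only on how the frames are ordered, not on the raw score values. In practice a library routine such as SciPy's `kendalltau` and `spearmanr` would likely be used instead; note that SciPy's default Kendall variant (tau-b) handles ties differently from the tau-a form sketched here.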
Evaluation Results
Performance results of each model on this benchmark
Comparison Table
Model | F1-score (Augmented) | F1-score (Canonical) | Kendall's Tau | Spearman's Rho |
---|---|---|---|---|
clip-it-language-guided-video-summarization | 69.0 | 66.3 | 0.108 | 0.147 |
query-twice-dual-mixture-attention-meta | - | 61.4 | 0.203 | 0.267 |
video-joint-modelling-based-on-hierarchical | 61.9 | 60.9 | 0.097 | 0.105 |
deep-reinforcement-learning-for-unsupervised | 59.8 | 58.1 | - | - |
combining-global-and-local-attention-with | - | 61.0 | 0.157 | 0.206 |
csta-cnn-based-spatiotemporal-attention-for | - | - | 0.194 | 0.255 |
supervised-video-summarization-via-multiple | - | 67.5 | - | - |
video-summarization-based-on-video-text | - | 60.4 | 0.181 | 0.238 |
joint-video-summarization-and-moment | 64.2 | 63.4 | 0.134 | 0.163 |
dsnet-a-flexible-detect-to-summarize-network | 63.9 | 62.1 | - | - |
video-summarization-based-on-video-text | 61.8 | 60.3 | 0.177 | 0.233 |
supervised-video-summarization-via-multiple | - | 63.9 | - | - |
supervised-video-summarization-via-multiple | - | 59.8 | - | - |
supervised-video-summarization-via-multiple | - | 63.7 | - | - |
supervised-video-summarization-via-multiple | - | 61.5 | 0.190 | 0.210 |
discriminative-feature-learning-for | 57.1 | 58.5 | - | - |
combining-global-and-local-attention-with | - | 62.7 | - | - |
supervised-video-summarization-via-multiple | - | 61 | - | - |
relational-reasoning-over-spatial-temporal | 63.6 | 63.0 | 0.162 | 0.212 |
align-and-attend-multimodal-summarization | - | 63.4 | 0.137 | 0.165 |
hierarchical-multimodal-transformer-to | 60.3 | 60.1 | 0.096 | 0.107 |