Audio-Visual Active Speaker Detection on AVA
Metrics
validation mean average precision
Results
Performance results of various models on this benchmark
Comparison table
Model name | Validation mean average precision |
---|---|
loconet-long-short-context-network-for-active | 95.2% |
maas-multi-modal-assignation-for-active | 88.8% |
learning-long-term-spatial-temporal-graphs | 94.2% |
multi-task-learning-for-audio-visual-active | 84.0% |
how-to-design-a-three-stage-architecture-for | 93.5% |
laser-lip-landmark-assisted-speaker-detection | 95.3% |
learning-long-term-spatial-temporal-graphs | 94.9% |
naver-at-activitynet-challenge-2019-task-b | 87.8% |
maas-multi-modal-assignation-for-active | 85.1% |
active-speakers-in-context | 87.1% |
talknce-improving-active-speaker-detection | 95.5% |
unicon-unified-context-network-for-robust | 92.0% |
audio-visual-activity-guided-cross-modal | 92.86% |
sub-word-level-lip-reading-with-visual | 89.2% |
active-speaker-detection-as-a-multi-objective | 91.9% |
end-to-end-active-speaker-detection | 94.1% |
ictcas-ucas-tal-submission-to-the-ava | 93.6% |
unicon-ictcas-ucas-submission-to-the-ava | 94.5% |
a-light-weight-model-for-active-speaker | 94.1% |
nus-hlt-report-for-activitynet-challenge-2021 | 92.3% |
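The metric reported above is validation mean average precision. As a rough illustration of the building block behind it, here is a minimal sketch of binary average precision over a ranked list of detections; the function name and the pure-Python formulation are illustrative, not taken from any of the listed submissions' evaluation code:

```python
def average_precision(labels, scores):
    """Binary average precision: mean of precision at each true positive,
    over detections ranked by descending score.

    labels: list of 0/1 ground-truth labels (1 = active speaker)
    scores: list of model confidence scores, same length as labels
    """
    # Rank detections by descending confidence score.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    ap = 0.0
    total_pos = sum(labels)
    for i in order:
        if labels[i]:
            tp += 1
            # Accumulate precision at this recall point.
            ap += tp / (tp + fp)
        else:
            fp += 1
    return ap / total_pos


# A perfect ranking (all positives scored above all negatives) yields AP = 1.0.
print(average_precision([1, 1, 0], [0.9, 0.8, 0.1]))
```

Mean average precision would then average this quantity over the evaluation units defined by the benchmark protocol.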