Audio-Visual Active Speaker Detection on AVA

Evaluation Metric

Validation mean average precision (mAP)
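As a rough illustration of what the metric measures, the sketch below computes non-interpolated average precision for a single binary label (speaking vs. not speaking) from per-face confidence scores. The function name `average_precision` and the toy data are hypothetical, and the benchmark's official evaluation script may differ in detail (for example, in interpolation or tie handling), so treat this as an approximation rather than the scoring procedure behind the numbers on this page.

```python
def average_precision(labels, scores):
    """Non-interpolated average precision for binary labels (1 = speaking).

    Illustrative only: not the benchmark's official evaluation code.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("average precision is undefined without positive labels")

    # Rank predictions from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision measured at the rank of each true positive.
            precision_sum += hits / rank

    return precision_sum / total_positives


# Hypothetical toy data: ground-truth labels and model confidence scores.
labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
ap = average_precision(labels, scores)
print(f"AP = {ap:.4f}")
```

A higher value means that speaking faces are ranked above non-speaking ones more consistently. A model that ranks every positive ahead of every negative scores 1.0.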

Results

Performance results of each model on this benchmark

Model	Validation mean average precision	Paper Title	Repository
LoCoNet	95.2%	LoCoNet: Long-Short Context Network for Active Speaker Detection
MAAS-TAN	88.8%	MAAS: Multi-modal Assignation for Active Speaker Detection
SPELL	94.2%	Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
3D-ResNet-GRU	84.0%	Multi-Task Learning for Audio Visual Active Speaker Detection	-
ASDNet	93.5%	How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild
LoCoNet + Laser	95.3%	LASER: Lip Landmark Assisted Speaker Detection for Robustness
SPELL+	94.9%	Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
VGG-{LSTM+TCN} (ensemble)	87.8%	Naver at ActivityNet Challenge 2019 -- Task B Active Speaker Detection (AVA)	-
MAAS-LAN	85.1%	MAAS: Multi-modal Assignation for Active Speaker Detection
Active Speakers in Context	87.1%	Active Speakers in Context
LoCoNet+TalkNCE	95.5%	TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning
UniCon	92.0%	UniCon: Unified Context Network for Robust Active Speaker Detection	-
GSCMIA	92.86%	Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection
VTP (visual only)	89.2%	Sub-word Level Lip Reading With Visual Attention	-
SA-uncertainty Fusion	91.9%	Active Speaker Detection as a Multi-Objective Optimization with Uncertainty-based Multimodal Fusion	-
EASEE-50	94.1%	End-to-End Active Speaker Detection
Extended UniCon	93.6%	ICTCAS-UCAS-TAL Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2021	-
UniCon+	94.5%	UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022	-
Light-ASD	94.1%	A Light Weight Model for Active Speaker Detection
TalkNet	92.3%	NUS-HLT Report for ActivityNet Challenge 2021 AVA (Speaker)	-
