Audio-Visual Active Speaker Detection on AVA

Metrics

validation mean average precision

Results

Performance results of various models on this benchmark

Model name	validation mean average precision	Paper Title	Repository
LoCoNet	95.2%	LoCoNet: Long-Short Context Network for Active Speaker Detection
MAAS-TAN	88.8%	MAAS: Multi-modal Assignation for Active Speaker Detection
SPELL	94.2%	Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
3D-ResNet-GRU	84.0%	Multi-Task Learning for Audio Visual Active Speaker Detection	-
ASDNet	93.5%	How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild
LoCoNet + LASER	95.3%	LASER: Lip Landmark Assisted Speaker Detection for Robustness
SPELL+	94.9%	Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
VGG-{LSTM+TCN} (ensemble)	87.8%	Naver at ActivityNet Challenge 2019 -- Task B Active Speaker Detection (AVA)	-
MAAS-LAN	85.1%	MAAS: Multi-modal Assignation for Active Speaker Detection
Active Speakers in Context	87.1%	Active Speakers in Context
LoCoNet+TalkNCE	95.5%	TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning
UniCon	92.0%	UniCon: Unified Context Network for Robust Active Speaker Detection	-
GSCMIA	92.86%	Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection
VTP (visual only)	89.2%	Sub-word Level Lip Reading With Visual Attention	-
SA-uncertainty Fusion	91.9%	Active Speaker Detection as a Multi-Objective Optimization with Uncertainty-based Multimodal Fusion	-
EASEE-50	94.1%	End-to-End Active Speaker Detection
Extended UniCon	93.6%	ICTCAS-UCAS-TAL Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2021	-
UniCon+	94.5%	UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022	-
Light-ASD	94.1%	A Light Weight Model for Active Speaker Detection
TalkNet	92.3%	NUS-HLT Report for ActivityNet Challenge 2021 AVA (Speaker)	-
