Speech Separation on LRS2
Metrics
PESQ
SDRi
SI-SNRi
STOI
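SI-SNRi is listed above without a definition. As a rough illustration only, the sketch below assumes the commonly used definition: the scale-invariant signal-to-noise ratio of the separated estimate against the clean reference, minus the same ratio computed for the unprocessed mixture. The function names `si_snr` and `si_snr_improvement` are hypothetical, and the evaluation code actually used by the listed papers may differ in details such as mean removal or the epsilon term.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences.

    Assumed definition: project the zero-mean estimate onto the zero-mean
    reference, then compare the energy of that projection with the residual.
    """
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    n = len(reference)

    # Remove the mean so that constant offsets do not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]

    # Target component: projection of the estimate onto the reference.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]

    # Everything left over is treated as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: gain in SI-SNR of the estimate over the raw mixture, in dB."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)
```

Under this definition, a higher SI-SNRi means the model removed more of the interfering speech relative to simply passing the mixture through; rescaling the estimate by a constant leaves the score essentially unchanged.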
Results
Performance of various models on this benchmark
Model name | PESQ | SDRi | SI-SNRi | STOI | Paper Title | Repository |
---|---|---|---|---|---|---|
TDFNet (MHSA + Shared) | 3.16 | 15.2 | 15.0 | 0.938 | TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion | |
RTFS-Net-12 | - | 15.1 | 14.9 | - | RTFS-Net: Recurrent Time-Frequency Modelling for Efficient Audio-Visual Speech Separation | |
TDFNet-small | 3.10 | 13.7 | 13.6 | 0.931 | TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion | |
IIANet | - | 16.6 | 16.4 | - | IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation | |
CTCNet | - | - | 14.3 | - | An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits | |
RTFS-Net-6 | - | 14.8 | 14.6 | - | RTFS-Net: Recurrent Time-Frequency Modelling for Efficient Audio-Visual Speech Separation | |
TDFNet-large | 3.21 | 15.9 | 15.8 | 0.949 | TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion | |
RTFS-Net-4 | - | 14.3 | 14.1 | - | RTFS-Net: Recurrent Time-Frequency Modelling for Efficient Audio-Visual Speech Separation | |