Speech Separation
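Speech separation is the task of extracting all overlapping speech sources from a mixed speech signal. As a special case of the source separation problem, it focuses on separating multiple speech signals from speakers talking at the same time, rather than other interfering signals such as music or noise. The technique has important applications in speech recognition in multi-speaker environments, hearing-assistance devices, and audio editing.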
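To make the task definition concrete, the following is a minimal sketch of the problem setup, not a separation model. It assumes the mixture is the sample-wise sum of the sources, uses pure tones as stand-ins for speech, and scores a hypothetical separator output with scale-invariant SDR under an exhaustive search over output orderings. The metric, the permutation search, and all function names here are illustrative assumptions and are not taken from the text above.

```python
import itertools
import math


def si_sdr(reference, estimate):
    """Scale-invariant signal-to-distortion ratio in dB (higher is better)."""
    eps = 1e-12  # avoids log(0) and division by zero
    dot = sum(r * e for r, e in zip(reference, estimate))
    ref_energy = sum(r * r for r in reference)
    scale = dot / (ref_energy + eps)
    # Project the estimate onto the reference; the rest counts as distortion.
    target = [scale * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


def best_permutation(references, estimates):
    """Try every assignment of estimates to references; keep the best mean SI-SDR."""
    best_perm, best_score = None, -math.inf
    for perm in itertools.permutations(range(len(estimates))):
        score = sum(
            si_sdr(references[i], estimates[p]) for i, p in enumerate(perm)
        ) / len(references)
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm, best_score


# Two synthetic "speakers" (assumption: pure tones stand in for speech).
n = 800
s1 = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
s2 = [math.sin(2 * math.pi * 13 * t / n) for t in range(n)]

# The observed signal is the overlap of both sources.
mixture = [a + b for a, b in zip(s1, s2)]

# Hypothetical separator output: correct sources, rescaled and in swapped order.
estimates = [[0.9 * v for v in s2], [1.1 * v for v in s1]]

perm, score = best_permutation([s1, s2], estimates)
print("assignment of estimates to references:", perm)
print("mean SI-SDR (dB):", round(score, 1))
```

The permutation search is there because a separator has no inherent way to know which output channel should correspond to which speaker, so the ordering of its outputs is arbitrary. Exhaustive search is only practical for a handful of sources, since the number of orderings grows factorially.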
GRID corpus (mixed-speech)
iKala
U-Net
Libri10Mix
Libri15Mix
Hungarian PIT
Libri20Mix
Libri2Mix
MossFormer2 (w speed perturb)
Libri5Mix
Hungarian PIT
LibriCSS
Conformer (large)
LRS2
TDFNet-small
LRS3
IIANet
TCD-TIMIT corpus (mixed-speech)
VoxCeleb2
RTFS-Net-4
WHAM!
SepReformer-L + DM
WHAMR!
TF-Locoformer (M)
WSJ0-2mix
SepReformer-L
WSJ0-2mix-16k
MossFormer2
WSJ0-3mix
Gated DualPathRNN
WSJ0-4mix
WSJ0-5mix
Gated DualPathRNN