Speech Separation on WSJ0-2mix
Evaluation Metrics
Number of parameters (M)
SDRi
SI-SDRi
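
SI-SDRi is the improvement, in dB, of the scale-invariant signal-to-distortion ratio of a separated source over that of the unprocessed mixture. The following is a minimal sketch of how that value is commonly computed for one source, assuming the usual zero-mean, projection-based definition of SI-SDR. The function names `si_sdr` and `si_sdr_improvement` and the small stabilizing constant `eps` are illustrative assumptions, not taken from this page or from any of the listed papers, whose evaluation code may differ in detail.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB between two equal-length 1-D signals.

    `eps` is an assumed stabilizer that avoids division by zero and log(0).
    """
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    n = len(reference)

    # Remove the mean of each signal (a common convention for SI-SDR).
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]

    # Project the estimate onto the reference to obtain the scaled target.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    scale = dot / (ref_energy + eps)
    target = [scale * r for r in ref]

    # Whatever is left after removing the target counts as distortion.
    residual = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(d * d for d in residual)

    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


def si_sdr_improvement(estimate, mixture, reference):
    """SI-SDRi: SI-SDR of the estimate minus SI-SDR of the raw mixture."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)
```

Calling `si_sdr_improvement(estimate, mixture, reference)` with plain lists of samples returns the gain in dB for one speaker. Benchmark figures such as those on this page are typically averaged over all speakers and test utterances, with the speaker assignment chosen by the best permutation.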
Evaluation Results
Performance results of each model on this benchmark
| Model | Number of parameters (M) | SDRi (dB) | SI-SDRi (dB) | Paper Title | Repository |
|---|---|---|---|---|---|
| TF-Locoformer (S) + DM | 5.0 | 23 | 22.8 | TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | |
| TD-Conformer (XL) + DM | - | - | 21.2 | On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments | |
| Conv-TasNet | 5.1 | - | 15.3 | Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation | |
| Two-step Conv-TasNet | - | - | 16.1 | Two-Step Sound Source Separation: Training on Learned Latent Targets | |
| TF-Locoformer (M) | 15.0 | 23.8 | 23.6 | TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement | |
| DPTNet | - | - | 20.2 | Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation | |
| SepReformer-L | 59.4 | 25.2 | 25.1 | Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation | |
| Wavesplit v2 | - | 22.3 | 22.2 | Wavesplit: End-to-End Speech Separation by Speaker Clustering | - |
| Deformable TCN + Dynamic Mixing | 3.6 | 17.4 | 17.2 | Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation | |
| Separate And Diffuse | - | - | 23.9 | Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation | - |
| TasNet | - | - | 10.8 | TasNet: time-domain audio separation network for real-time, single-channel speech separation | |
| Sandglasset | - | - | 21.0 | Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation | |
| SPGM + DM | 26.2 | - | 22.7 | SPGM: Prioritizing Local Features for enhanced speech separation performance | |
| Deep Clustering ++ | - | - | 10.8 | Deep clustering: Discriminative embeddings for segmentation and separation | |
| SPGM | 26.2 | - | 22.1 | SPGM: Prioritizing Local Features for enhanced speech separation performance | |
| Sudo rm -rf (U=36) | - | - | 19.5 | Compute and memory efficient universal sound source separation | |
| SepTDA (L=12) | - | - | 24.0 | Boosting Unknown-number Speaker Separation with Transformer Decoder-based Attractor | - |
| SepIt | - | - | 22.4 | SepIt: Approaching a Single Channel Speech Separation Bound | - |
| MossFormer (L) + DM | 42.1 | - | 22.8 | MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-Head Transformer with Convolution-Augmented Joint Self-Attentions | |
| Gated DualPathRNN | - | - | 20.12 | Voice Separation with an Unknown Number of Multiple Speakers | |