Speech Emotion Recognition
Speech emotion recognition is a task in speech processing and computational paralinguistics that aims to identify and classify the emotional states expressed by speakers, such as happiness, anger, sadness, or frustration, by analyzing speech patterns such as prosody, pitch, and rhythm. The technology is applied in areas such as human-computer interaction, mental health assessment, and customer service. For multimodal emotion recognition, please upload results to the designated page.
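As a rough illustration of what analyzing pitch and other prosodic cues can mean in practice, the sketch below computes two classic frame-level descriptors, RMS energy and an autocorrelation-based pitch estimate, on a synthetic 200 Hz tone. It is a minimal sketch under stated assumptions and not the method of any system listed on this page: the function names, the 50-400 Hz search range, and the test signal are all hypothetical choices made for the example, and current systems typically learn their features from raw audio or spectrograms.

```python
import math


def rms_energy(frame):
    """Root-mean-square energy of one frame (a simple loudness cue)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def estimate_pitch(frame, sample_rate, fmin=50.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one frame.

    Picks the lag with the largest unnormalized autocorrelation inside
    the assumed pitch range [fmin, fmax]. Returns None when no lag has
    a positive score (for example, on a silent frame).
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


# Hypothetical test signal: a 50 ms frame of a pure 200 Hz tone at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(400)]

pitch_hz = estimate_pitch(frame, sample_rate)
energy = rms_energy(frame)
print(f"pitch: {pitch_hz:.1f} Hz, RMS energy: {energy:.3f}")
```

On real speech, descriptors like these would be computed over many overlapping frames, and their trajectories (pitch contour, energy contour, timing) would then be summarized or passed to a classifier.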
CREMA-D
Vertically long patch ViT
Dusha Crowd
Dusha Podcast
Dusha baseline
EMODB
VGG-optiVMD
EmoDB Dataset
VQ-MAE-S-12 (Frame) + Query2Emo
IEMOCAP
SER with MTL
LSSED
PyResNet
MSP-IMPROV
emoDARTS
MSP-Podcast (Activation)
wav2small-Teacher
MSP-Podcast (Dominance)
wav2small-Teacher
MSP-Podcast (Valence)
Quechua-SER
LSTM
RAVDESS
xlsr-Wav2Vec2.0(FineTuning)
RESD
emotion2vec+base
ShEMO