HyperAI

Speech

Speech technology covers the ability of computer systems to process human speech, spanning speech recognition, synthesis, and understanding. The aim is to build systems that interact with people efficiently through voice. It is widely used in virtual assistants, customer service systems, and voice translation, where it makes human-computer interaction more natural and convenient.

Speech Recognition

136 papers | 65 benchmarks

Speech Separation

47 papers | 19 benchmarks

Speech Emotion Recognition

31 papers | 15 benchmarks

Speech Enhancement

64 papers | 14 benchmarks

Dialogue Generation

12 papers | 13 benchmarks

Speaker Diarization

10 papers | 12 benchmarks

Speaker Verification

12 papers | 12 benchmarks

Spoken language identification

6 papers | 12 benchmarks

Keyword Spotting

53 papers | 10 benchmarks

Automatic Speech Recognition (ASR)

11 papers | 8 benchmarks

Multimodal Emotion Recognition

12 papers | 7 benchmarks

Automatic Phoneme Recognition

1 paper | 6 benchmarks

Bandwidth Extension

2 papers | 6 benchmarks

Text-To-Speech Synthesis

15 papers | 6 benchmarks

Automatic Lyrics Transcription

2 papers | 5 benchmarks

Speech Dereverberation

6 papers | 5 benchmarks

Speech Synthesis

19 papers | 5 benchmarks

Spoken Language Understanding

20 papers | 5 benchmarks

Story Generation

2 papers | 5 benchmarks

Accented Speech Recognition

2 papers | 4 benchmarks

Audio-Visual Speech Recognition

19 papers | 4 benchmarks

Speaker Identification

9 papers | 4 benchmarks

Speech-to-Speech Translation

5 papers | 3 benchmarks

Voice Conversion

3 papers | 3 benchmarks

Arabic Text Diacritization

7 papers | 2 benchmarks

Distant Speech Recognition

5 papers | 2 benchmarks

Noisy Speech Recognition

5 papers | 2 benchmarks

Speech Denoising

1 paper | 2 benchmarks

Speech Synthesis - Gujarati

2 papers | 2 benchmarks

Visual Speech Recognition

2 papers | 2 benchmarks

1 paper | 1 benchmark

1 paper | 1 benchmark

1 paper | 1 benchmark

Acoustic Unit Discovery

1 paper | 1 benchmark

Audio Deepfake Detection

8 papers | 1 benchmark

Cultural Vocal Bursts Intensity Prediction

2 papers | 1 benchmark

Lip to Speech Synthesis

1 paper | 1 benchmark

Phone-level pronunciation scoring

6 papers | 1 benchmark

Speaker Recognition

2 papers | 1 benchmark

Speech Extraction

1 paper | 1 benchmark

Speech Synthesis - Assamese

1 paper | 1 benchmark

Speech Synthesis - Bengali

1 paper | 1 benchmark

Speech Synthesis - Bodo

1 paper | 1 benchmark

Speech Synthesis - Hindi

1 paper | 1 benchmark

Speech Synthesis - Kannada

1 paper | 1 benchmark

Speech Synthesis - Malayalam

1 paper | 1 benchmark

Speech Synthesis - Manipuri

1 paper | 1 benchmark

Speech Synthesis - Marathi

1 paper | 1 benchmark

Speech Synthesis - Rajasthani

1 paper | 1 benchmark

Speech Synthesis - Tamil

1 paper | 1 benchmark

Speech Synthesis - Telugu

1 paper | 1 benchmark

Spoken Command Recognition

3 papers | 1 benchmark

Vocal Bursts Type Prediction

1 paper | 1 benchmark

Utterance-level pronunciation scoring

3 papers | 1 benchmark

Voice Query Recognition

1 paper | 1 benchmark

Word-level pronunciation scoring

3 papers | 1 benchmark

