HyperAI

Speech Recognition

Speech recognition is the task of converting spoken language into text: identifying the words in an audio recording and transcribing them in written form. The goal is to transcribe audio accurately, either in real time or from recordings, while accounting for factors such as accents, speaking rate, and background noise that affect the accuracy and reliability of the transcript. The technology is widely applied in human-computer interaction, automatic subtitle generation, and voice assistants.
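This page does not say how transcription accuracy is scored on each benchmark. The conventional measure is word error rate (WER), which one of the benchmark names listed here mentions explicitly. As a minimal sketch, assuming simple whitespace tokenization and no text normalization (real evaluations usually normalize case and punctuation first), WER can be computed as the word-level edit distance divided by the number of reference words. The function name `word_error_rate` is illustrative and does not come from any benchmark toolkit.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly;
    no casing or punctuation normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference. Lower is better.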

LibriSpeech test-clean

HuBERT with Libri-Light

LibriSpeech test-other

wav2vec 2.0 with Libri-Light

Switchboard + Hub500

Common Voice German

wav2vec 2.0 XLS-R 1B + TEVR (5-gram)

swb_hub_500 WER fullSWBCH

QuartzNet15x5DE (D37)

Common Voice French

ConformerCTC-L (5-gram)

Common Voice Spanish

ConformerCTC-L (4-gram)

Paraformer-large

GigaSpeech TEST

Zipformer+pruned transducer w/ CR-CTC (no external language model)

Hub5'00 SwitchBoard

LAS + SpecAugment (with LM, Switchboard mild policy)

Libri-Light test-clean

Libri-Light test-other

CHiME-6 dev_gss12

Common Voice vi

Vietnamese end-to-end speech recognition using wav2vec 2.0 by VietAI

Europarl-ASR EN Guest-test

Triphone (39 features) + LDA and MLLT + SGMM

Speech Commands

khanhld/chunkformer-large-vie

Common Voice English

Whisper (Large v2)

Common Voice Italian

Whisper (Large v2)

Europarl-ASR EN MEP-test

Whisper-LLaMa-7b

AISHELL-2 Test Android

AISHELL-2 Test IOS

AISHELL-2 Test Mic

WavLM Large & EEND-vector clustering

CALLHOME Spanish Speech

Common Voice Frisian

Common Voice Japanese

Common Voice Portuguese

XLSR53 Wav2Vec2 Portuguese by Orlem Santos

Common Voice Russian

Whisper (Large v2)

facebook/multilingual_librispeech german

Conformer/Transformer-AED

Google Speech Commands - Musan

Hub5'00 CallHome

Hub5'00 FISHER-SWBD

LibriSpeech 100h test-clean

LibriSpeech 100h test-other

Branchformer + GFSA

LibriSpeech train-clean-100 test-clean

wav2vec_wav2letter

LibriSpeech train-clean-100 test-other

wav2vec_wav2letter

Switchboard (300hr)

Switchboard CallHome

Switchboard SWBD

AISHELL-2 Android

ATCOSIM corpus (Air Traffic Control Communications)

ATCOSIM dataset (Air Traffic Control Communications)

Common Voice 7.0 Abkhaz

Common Voice 7.0 Arabic

Common Voice 7.0 Bashkir

Common Voice 7.0 German

Common Voice 7.0 Hindi

Common Voice 7.0 Odia

Common Voice 7.0 Portuguese

Common Voice 7.0 Votic

Common Voice 8.0 Assamese

Common Voice 8.0 Basaa

Common Voice 8.0 Breton

Common Voice 8.0 Bulgarian

Common Voice 8.0 Central Kurdish

Common Voice 8.0 Dutch

Common Voice 8.0 Erzya

Common Voice 8.0 French

Common Voice 8.0 Galician

Common Voice 8.0 German

Common Voice 8.0 Guarani

Common Voice 8.0 Hausa

Common Voice 8.0 Hindi

Common Voice 8.0 Hungarian

Common Voice 8.0 Japanese

Common Voice 8.0 Kabyle

Common Voice 8.0 Kazakh

Common Voice 8.0 Kurmanji Kurdish

Common Voice 8.0 Maltese

Common Voice 8.0 Marathi

Common Voice 8.0 Odia

Common Voice 8.0 Portuguese

Common Voice 8.0 Punjabi

Common Voice 8.0 Romansh Sursilvan

Common Voice 8.0 Romansh Vallader

Common Voice 8.0 Russian

Common Voice 8.0 Santali (Ol Chiki)

Common Voice 8.0 Serbian

Common Voice 8.0 Slovenian

Common Voice 8.0 Sorbian, Upper

Common Voice 8.0 Swahili

Common Voice 8.0 Tatar

Common Voice 8.0 Uzbek

Common Voice 8.0 Votic

Common Voice Arabic

Common Voice Breton

Common Voice Catalan

Common Voice Chinese (China)

Common Voice Czech

Common Voice Dutch

Common Voice Georgian

Common Voice Hindi

Common Voice Indonesian

Common Voice Lithuanian

Common Voice Maltese

Common Voice Odia

Common Voice Persian

Common Voice Polish

Common Voice Swedish

Common Voice Tamil

Common Voice Turkish

Common Voice Vietnamese

Common Voice Welsh

German ASR Data-Mix

Kazakh Speech Corpus v1.1

Mozilla Common Voice 15.0 Persian

Mozilla Common Voice 16.1

Mozilla Common Voice 9.0

projecte-aina/parlament_parla ca

Robust Speech Event - Catalan Dev Data

Robust Speech Event - Dev Data

Russian LibriSpeech

UWB-ATCC dataset (Air Traffic Control Communications)


© HyperAI

