HiFiTTS-2 Large-Scale High-Bandwidth Speech Dataset

Date: 3 months ago
Organization: NVIDIA
Paper URL: 2506.04152
License: CC BY 4.0

HiFiTTS-2 is a large-scale high-bandwidth speech dataset released by NVIDIA in 2025 and described in the paper "HiFiTTS-2: A Large-Scale High Bandwidth Speech Dataset". It is designed to support the training and evaluation of high-quality zero-shot text-to-speech (TTS) models.

The dataset contains metadata for audio from 5,000 speakers: approximately 36,700 hours of English speech usable at 22.05 kHz and approximately 31,700 hours usable at 44.1 kHz, organized by estimated bandwidth and sampling rate. The recordings come from LibriVox audiobooks, and the source audio can be downloaded from LibriVox at a 48 kHz sampling rate. This makes the dataset suitable for training high-resolution vocoders and non-autoregressive speech synthesis models.

The data includes:

  • Speech audio (22.05 kHz / 44.1 kHz, mono)
  • Transcripts and chapter/episode metadata
  • Speaker labels, estimated bandwidth, and segmentation timestamps
  • Training/validation manifests and example configurations
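
As a rough illustration of how manifests like these might be used, the sketch below reads a JSON-lines manifest and keeps only utterances whose estimated bandwidth reaches a chosen threshold. The file layout and the field names (`audio_filepath`, `duration`, `speaker`, `bandwidth`) are assumptions made for this example, not the dataset's documented schema. The thresholds are simply the Nyquist limits of the two sampling rates (11,025 Hz for 22.05 kHz and 22,050 Hz for 44.1 kHz), not the selection criteria used by the dataset authors.

```python
import io
import json


def read_manifest(lines):
    """Parse a JSON-lines manifest: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def filter_by_bandwidth(entries, min_bandwidth_hz):
    """Keep entries whose estimated bandwidth is at least min_bandwidth_hz.

    The "bandwidth" key is a hypothetical field name; entries without it
    are treated as having zero bandwidth and are dropped.
    """
    return [e for e in entries if e.get("bandwidth", 0) >= min_bandwidth_hz]


def total_hours(entries):
    """Sum the (assumed) "duration" field, given in seconds, as hours."""
    return sum(e["duration"] for e in entries) / 3600.0


# Tiny in-memory manifest standing in for a real file (values are made up).
example_manifest = io.StringIO(
    '{"audio_filepath": "a.flac", "duration": 1800.0, "speaker": "spk1", "bandwidth": 11025}\n'
    '{"audio_filepath": "b.flac", "duration": 3600.0, "speaker": "spk2", "bandwidth": 22050}\n'
    '{"audio_filepath": "c.flac", "duration": 900.0, "speaker": "spk1", "bandwidth": 8000}\n'
)

entries = read_manifest(example_manifest)

# Nyquist limits: half of each sampling rate.
subset_22k = filter_by_bandwidth(entries, 22050 / 2)
subset_44k = filter_by_bandwidth(entries, 44100 / 2)

print(len(subset_22k), total_hours(subset_22k))
print(len(subset_44k), total_hours(subset_44k))
```

With a real manifest, `read_manifest` could be passed an open file object instead of the in-memory example, after adjusting the field names to match the actual files.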
