HiFiTTS-2 Large-Scale High-Bandwidth Speech Dataset

Date: a year ago
Organization:
Paper URL: 2506.04152
License: CC BY 4.0

The data includes:

Voice audio (22 kHz / 44 kHz, mono)
Transcripts and chapter/episode metadata
Speaker and bandwidth quality estimates, and segmentation timestamps
Training/validation manifests and example configurations
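
As a rough illustration of how the manifests and bandwidth estimates listed above might be used together, the sketch below filters manifest entries by estimated bandwidth. It assumes the manifests are JSON-lines files (one JSON object per line); the field names `audio_filepath`, `text`, `speaker`, and `bandwidth`, the sample values, and the threshold are all hypothetical and should be checked against the actual files before use.

```python
import json
from typing import Iterable, Iterator


def filter_manifest(lines: Iterable[str], min_bandwidth_hz: float) -> Iterator[dict]:
    """Yield manifest entries whose estimated bandwidth is at least min_bandwidth_hz."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        # "bandwidth" is a hypothetical field name; entries without it are dropped.
        if entry.get("bandwidth", 0) >= min_bandwidth_hz:
            yield entry


# Made-up sample lines standing in for a real manifest file.
sample = [
    '{"audio_filepath": "a.flac", "text": "hello", "speaker": 1, "bandwidth": 13000}',
    '{"audio_filepath": "b.flac", "text": "world", "speaker": 2, "bandwidth": 9000}',
]

kept = list(filter_manifest(sample, min_bandwidth_hz=11000))
print([entry["audio_filepath"] for entry in kept])
```

With a real manifest, the same function could be applied to an open file handle, since iterating over a text file yields one line at a time.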

Citation

@inproceedings{rlangman2025hifitts2,
title={HiFiTTS-2: A Large-Scale High Bandwidth Speech Dataset},
author={Ryan Langman and Xuesong Yang and Paarth Neekhara and Shehzeen Hussain and Edresson Casanova and Evelina Bakhturina and Jason Li},
booktitle={Interspeech},
year={2025},
}
