OuteTTS: Speech Generation Engine

1. Tutorial Introduction

OuteTTS's features include:
  • Text-to-speech synthesis: Generates natural, fluent speech from input text, with adjustable speaking rate and intonation.
  • Voice cloning: Creates a personalized voice from a reference audio clip as short as a few seconds plus its transcript, which suits customized voice assistants, audiobooks, and similar scenarios.

This tutorial uses Llama-OuteTTS-1.0-1B, a model released by Oute AI in March 2025. Its parameter count has grown from 350 million to 1 billion, which significantly improves the expressiveness and stability of the generated speech. The model also supports speech synthesis in 20 languages, and its cross-language voice cloning has been further optimized.

This tutorial runs on a single RTX 4090 GPU. It provides two usage examples, Default Speaker and Voice Cloning, and supports English only.

2. Demo results

3. Operation steps

1. Start the container

2. Usage steps

If the page shows "Bad Gateway", the model is still initializing. Because the model is large, wait about 2-3 minutes and then refresh the page.
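Instead of refreshing by hand, the wait can be scripted. The following is a minimal sketch, not part of the tutorial itself: it polls a URL until the server answers with HTTP 200. The function names, the polling interval, and the five-minute limit are all illustrative assumptions, and the URL in the usage comment is a placeholder for the address shown after the container starts.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Error responses such as 502 Bad Gateway arrive here.
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def wait_until_ready(url, fetch=http_status, interval=15, max_wait=300,
                     sleep=time.sleep):
    """Poll url until it returns HTTP 200.

    Returns True once the page is ready, or False after roughly
    max_wait seconds. fetch and sleep are injectable to ease testing.
    """
    waited = 0
    while True:
        if fetch(url) == 200:
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval


# Hypothetical usage; replace the placeholder with your own container URL:
# ready = wait_until_ready("https://<your-container-address>/")
```

A limit of about five minutes leaves some margin over the 2-3 minutes mentioned above; adjust it if your model loads more slowly.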

In Safari, the audio may not play directly in the page; download the file and play it locally.

Specific parameters:

  • Text: The text to synthesize.
  • Temperature: Scaling factor that controls the randomness of the output; higher values produce more varied results.
  • Repetition Penalty: Penalty coefficient that discourages repeated output.
  • Top-k: Limits sampling at each step to the k most probable candidate tokens.
  • Top-p: Nucleus sampling; candidates are chosen dynamically from the smallest set of tokens whose cumulative probability reaches p.
  • Minimum Probability (min-p): Minimum probability threshold a candidate token must meet to be kept.
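To make the sampling parameters above concrete, here is a small pure-Python sketch of how such filters are commonly applied to a model's next-token scores. It is an illustration under stated assumptions, not the code OuteTTS runs: the function names are hypothetical, the order of the filters varies between inference libraries, and min-p is treated here as a fraction of the top candidate's probability, which is one common convention.

```python
import math


def apply_repetition_penalty(logits, previous_ids, penalty):
    """Lower the score of tokens that were already generated."""
    out = list(logits)
    for i in set(previous_ids):
        # Dividing a positive score or multiplying a negative one both reduce it.
        out[i] = out[i] / penalty if out[i] > 0 else out[i] * penalty
    return out


def softmax(logits, temperature=1.0):
    """Turn scores into probabilities; a lower temperature sharpens the result."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_candidates(probs, top_k=0, top_p=1.0, min_p=0.0):
    """Return {token_id: probability} after top-k, top-p, and min-p filtering.

    Assumed order: top-k, then top-p, then min-p. The surviving
    probabilities are renormalized once at the end.
    """
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]  # keep only the k most probable tokens

    kept, cumulative = [], 0.0
    for token_id, p in ranked:
        kept.append((token_id, p))
        cumulative += p
        if cumulative >= top_p:  # smallest prefix whose mass reaches top_p
            break

    # Assumption: the min-p threshold is relative to the best candidate.
    threshold = min_p * kept[0][1]
    kept = [(t, p) for t, p in kept if p >= threshold]

    total = sum(p for _, p in kept)
    return {t: p / total for t, p in kept}


# Toy example with a four-token vocabulary (values are made up).
logits = apply_repetition_penalty([2.0, 1.0, 0.5, -1.0], previous_ids=[0], penalty=1.1)
probs = softmax(logits, temperature=0.4)
print(filter_candidates(probs, top_k=3, top_p=0.9, min_p=0.05))
```

In this sketch, lowering Temperature or Top-p, or raising min-p, narrows the candidate set and makes output more repeatable, while the opposite settings allow more variation.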

1. Default Speaker

2. Voice Cloning

4. Discussion

🖌️ If you come across a high-quality project, leave us a message to recommend it. We have also set up a tutorial discussion group: scan the QR code and add the note [SD Tutorial] to join, discuss technical issues, and share your results.
