HyperAI

Microsoft VibeVoice-1.5B Redefines the Boundaries of TTS Technology

1. Tutorial Introduction


This tutorial runs on a single RTX 4090 GPU.

2. Demo

3. Procedure

1. Start the container

2. Usage steps

If the page shows "Bad Gateway", the model is still initializing. Because the model is large, wait about 2-3 minutes and then refresh the page.
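Instead of refreshing by hand, the wait can be scripted. The following is a minimal sketch, not part of the tutorial itself: it polls a page until it stops returning an error status such as 502 (Bad Gateway). The helper names `fetch_status` and `wait_until_ready`, the timeout values, and the URL in the usage comment are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Error responses such as 502 Bad Gateway arrive here.
        return err.code
    except urllib.error.URLError:
        # Connection refused, DNS failure, and similar.
        return None


def wait_until_ready(get_status, timeout_s=300, interval_s=10, sleep=time.sleep):
    """Call get_status() repeatedly until it returns 200 or the timeout elapses.

    Returns True once the page responds with 200, False on timeout.
    """
    waited = 0
    while waited <= timeout_s:
        if get_status() == 200:
            return True
        sleep(interval_s)
        waited += interval_s
    return False


# Example usage (hypothetical address; substitute the URL of your own container):
# ready = wait_until_ready(lambda: fetch_status("http://localhost:8080"))
# print("ready" if ready else "still initializing")
```

The default timeout of 300 seconds is an arbitrary choice that leaves some margin over the 2-3 minutes mentioned above.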

Specific parameters:

  • Generation Parameters
    • CFG Scale: Controls how closely the generated audio follows the input dialogue text
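To give some intuition for the CFG Scale parameter, here is a minimal sketch of the generic classifier-free guidance (CFG) formula, assuming that "CFG" carries its usual meaning here. The tutorial does not describe how VibeVoice applies the scale internally, so the function name `apply_cfg` and the toy vectors are illustrative assumptions, not the model's actual code.

```python
def apply_cfg(cond, uncond, cfg_scale):
    """Blend two model predictions using the generic classifier-free guidance rule.

    cond:      prediction conditioned on the input text (list of floats)
    uncond:    prediction made without the text condition (list of floats)
    cfg_scale: guidance strength; 1.0 returns the conditional prediction unchanged,
               larger values push the result further toward the text condition
    """
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]


# Toy example with made-up numbers.
cond = [1.0, 2.0]
uncond = [0.0, 0.0]
print(apply_cfg(cond, uncond, 1.0))  # same as cond
print(apply_cfg(cond, uncond, 2.0))  # moved further away from uncond
```

Under this generic rule, a higher scale ties the output more tightly to the input text, typically at the cost of less natural variation.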

Result

4. Discussion

If you come across a high-quality project, leave us a message to recommend it. We have also set up a tutorial discussion group: scan the QR code and add the note [SD Tutorial] to join, discuss technical issues, and share your results.