Microsoft VibeVoice-1.5B Redefines the Boundaries of TTS Technology
1. Tutorial Introduction

This tutorial runs on a single RTX 4090 GPU.
2. Demo

3. Operation steps
1. Start the container

2. Usage steps
If the page shows "Bad Gateway", the model is still initializing. Because the model is large, wait about 2-3 minutes and then refresh the page.
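
Instead of refreshing by hand, the wait can be scripted. The following is a minimal sketch, not part of the tutorial itself: it polls the page until it stops returning an error such as 502 Bad Gateway. The URL, the 5-minute limit, and the 10-second interval are placeholder assumptions; substitute the address your container actually exposes.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx, e.g. 502 while the model loads.
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


def wait_until_ready(url, max_wait=300.0, interval=10.0,
                     get_status=fetch_status, sleep=time.sleep):
    """Poll url until it answers 200; give up after roughly max_wait seconds."""
    waited = 0.0
    while True:
        if get_status(url) == 200:
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval


if __name__ == "__main__":
    # Hypothetical address: replace with the URL shown for your container.
    demo_url = "http://localhost:8080"
    if wait_until_ready(demo_url):
        print("The page is up; open it in the browser.")
    else:
        print("Still not ready; check the container logs.")
```

The status check and the sleep function are passed in as parameters so the loop can be exercised without a running server.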

Specific parameters:
- Generation Parameters
- CFG Scale: Controls how closely the generated audio follows the input dialogue text
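
To give a feel for what a CFG scale does, here is a minimal sketch of the standard classifier-free guidance formula, which blends a text-conditioned prediction with an unconditioned one. This is an assumption about how the slider behaves in general terms; the tutorial does not describe the model's internal implementation, and the function name `apply_cfg` and the sample numbers are purely illustrative.

```python
def apply_cfg(cond, uncond, cfg_scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond).

    cond      -- prediction made with the input text (illustrative values)
    uncond    -- prediction made without the text
    cfg_scale -- guidance strength; larger values push harder toward the text
    """
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]


cond = [1.0, 2.0]
uncond = [0.0, 1.0]

print(apply_cfg(cond, uncond, 0.0))  # [0.0, 1.0] -> text is ignored
print(apply_cfg(cond, uncond, 1.0))  # [1.0, 2.0] -> plain conditioned prediction
print(apply_cfg(cond, uncond, 2.0))  # [2.0, 3.0] -> pushed further toward the text
```

Under this formula, a higher scale ties the output more tightly to the input text, usually at some cost to naturalness or variety.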
Result

4. Discussion
🖌️ If you come across a high-quality project, leave us a message to recommend it! We have also set up a tutorial discussion group: scan the QR code and add the note [SD Tutorial] to join, discuss technical issues, and share your results.
