One-click Deployment of Parler-TTS
Tutorial Introduction
Parler-TTS is a lightweight text-to-speech (TTS) model that can generate high-quality, natural speech in the style of a given speaker. It is highly controllable: the speaker's gender, timbre, intonation, and the acoustic scene (indoors, outdoors, on the road, in a concert hall, etc.) can all be set through a text prompt. Parler-TTS is a reproduction of the paper "Natural language guidance of high-fidelity text-to-speech with synthetic annotations" by Dan Lyth and Simon King, of Stability AI and the University of Edinburgh respectively.
Unlike other TTS models, Parler-TTS is completely open source. All datasets, preprocessing, training code, and weights are publicly released under a permissive license, enabling the community to build on this work and develop their own powerful TTS models. Note: This model does not yet support Chinese.
Run steps
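1. Clone and start the container, wait about 30 seconds for the model to load, then click the API address to open the web interface (an RTX 4090 is sufficient to run it)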

2. Enter the text to generate and a style description, then click Submit to generate the audio
• Input Text: the text that needs to be converted into speech
• Description: a description of the voice, scene, tone, timbre, and so on, similar to a prompt. For example: A man voice speaks slightly slowly with very noisy background, carrying a low-pitch tone and displaying a touch of expressiveness and animation. The sound is very distant, adding an air of intrigue.
• Parler-TTS generation: the generated audio file, which can be played back and downloaded
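
The two input fields above correspond to the two inputs that the open-source `parler_tts` Python package takes. The following is a minimal sketch of the same workflow in code, not the container's actual implementation. It assumes the `parler_tts`, `transformers`, `torch`, and `soundfile` packages are installed, and it assumes the `parler-tts/parler_tts_mini_v0.1` checkpoint, since this tutorial does not state which checkpoint the container loads. The helper names `build_description` and `synthesize` are hypothetical and exist only for illustration.

```python
def build_description(speaker: str, pace: str, pitch: str, background: str) -> str:
    """Compose a plain-English style description (hypothetical helper)."""
    return (
        f"A {speaker} speaker talks at a {pace} pace with a {pitch} pitch. "
        f"The recording has {background}."
    )


def synthesize(text: str, description: str, out_path: str = "parler_tts_out.wav") -> str:
    """Generate speech for `text` in the style given by `description`."""
    # Heavy dependencies are imported here so build_description works without them.
    import soundfile as sf
    import torch
    from parler_tts import ParlerTTSForConditionalGeneration
    from transformers import AutoTokenizer

    # Assumed checkpoint; the tutorial does not say which one the container uses.
    checkpoint = "parler-tts/parler_tts_mini_v0.1"
    device = "cuda:0" if torch.cuda.is_available() else "cpu"

    model = ParlerTTSForConditionalGeneration.from_pretrained(checkpoint).to(device)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)

    # The description conditions the voice; the prompt is the text to be spoken.
    input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
    prompt_input_ids = tokenizer(text, return_tensors="pt").input_ids.to(device)

    generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
    audio = generation.cpu().numpy().squeeze()

    # Write a WAV file at the model's native sampling rate.
    sf.write(out_path, audio, model.config.sampling_rate)
    return out_path


# Example call (downloads the checkpoint on first use):
# description = build_description("male", "slightly slow", "low", "a very noisy background")
# synthesize("Hey, how are you doing today?", description)
```

Keeping the description as a separate string mirrors the web interface: the same Input Text can be rendered in different voices by changing only the Description.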

Exchange and discussion
🖌️ If you come across a high-quality project, please leave us a message to recommend it! We have also set up a tutorial discussion group. Scan the QR code and add the note [SD Tutorial] to join the group, discuss technical issues, and share your results ↓