Stability AI Releases Stable Audio 3.0: Up to Six Minutes of Professional-Grade Music with Open Weights for Smaller Models
Today, Stability AI, the company behind Stable Diffusion, released its latest audio model family, Stable Audio 3.0, comprising four models: Small SFX (459 million parameters), Small (459 million parameters), Medium (1.4 billion parameters), and Large (2.7 billion parameters). The two smaller models are designed to run on edge devices and can generate up to two minutes of audio and music. The Medium and Large models can produce complete musical compositions lasting six minutes and twenty seconds while maintaining coherent structure and melodic tonality, more than double the duration supported by Stable Audio 2.0, which launched in 2024.

Stability AI has released the Small SFX, Small, and Medium models with open weights for anyone to use and modify. Compared with the earlier Stable Audio Open, which only supported generation lengths under 47 seconds, the new generation is a significant step forward on the open-weight front. The Large model, by contrast, will be accessible exclusively via API and paid self-hosted services; enterprises with annual revenue exceeding $1 million must obtain an enterprise license.

Competition in music generation continues to intensify, with companies such as Google and ElevenLabs entering the field. However, recent lawsuits involving Suno and Udio suggest that securing data licenses through partnerships with record labels may prove critical to these platforms' long-term viability. Last year, Stability AI signed agreements with Warner Music Group and Universal Music Group, and it states that its newest audio models were trained entirely on fully licensed data. The company also announced that it is developing a specialized toolkit for professional musicians. Ethan Kaplan, formerly chief digital officer at Universal Audio and Fender, has joined Stability AI to lead its professional music product line.
