

Stable Audio Open 1.0
Overview
Stable Audio Open 1.0 is a text-to-audio model that combines an autoencoder, T5-based text embeddings, and a transformer-based diffusion model to generate up to 47 seconds of stereo audio. It produces music and other audio from text prompts and is intended to support research and experimentation with the current capabilities of generative AI models. The model was trained on data from Freesound and the Free Music Archive (FMA), sources chosen to provide diverse, properly licensed audio.
Target Users
The model is suited to music producers, audio engineers, researchers, and any individual or team interested in AI music generation. It gives artists a tool for experimenting with and creating new musical works, and gives researchers a platform for studying and improving generative AI models.
Use Cases
Music producers use this model to generate new background music based on text prompts.
Researchers use the model to study generative AI models and improve the scientific understanding of them.
Audio engineers use the model to explore sound-effect generation from different text prompts.
Features
Generates up to 47 seconds of stereo audio.
Supports a 44.1 kHz audio sample rate.
Generates music and audio from text prompts.
Utilizes an autoencoder to compress waveforms to manageable sequence lengths.
Employs T5-based text embedding techniques for text conditioning.
Diffusion model operates in the latent space of the autoencoder.
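To make the "manageable sequence lengths" point concrete, here is a rough arithmetic sketch of how many raw samples a maximum-length clip contains and how many latent frames the diffusion model would see after compression. The 47-second and 44.1 kHz figures come from the feature list; the downsampling ratio of 2048 is an assumption used for illustration and is not stated on this page, and `latent_length` is a hypothetical helper.

```python
import math

SAMPLE_RATE_HZ = 44_100    # from the feature list
MAX_SECONDS = 47           # from the feature list
DOWNSAMPLING_RATIO = 2048  # ASSUMPTION: illustrative autoencoder ratio, not stated in this page


def latent_length(seconds, sample_rate=SAMPLE_RATE_HZ, ratio=DOWNSAMPLING_RATIO):
    """Number of latent frames needed to cover `seconds` of audio per channel."""
    samples = round(seconds * sample_rate)
    return math.ceil(samples / ratio)


samples = MAX_SECONDS * SAMPLE_RATE_HZ
frames = latent_length(MAX_SECONDS)
print(samples, frames)  # 2072700 1013
```

Under that assumed ratio, roughly two million samples per channel shrink to about a thousand latent frames, which is a sequence length a transformer can attend over.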
How to Use
Install the stable-audio-tools library.
Download the pre-trained model using the provided code examples.
Set the text and timing conditioning: the prompt, the audio's start time, and its total duration.
Run conditional diffusion generation with the model to produce the audio.
Rearrange the output batch into a single sequence, peak-normalize, clip, convert to int16, and save the generated audio to a file.
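The steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the library is installed with `pip install stable-audio-tools`, that the weights are available under the Hugging Face identifier `stabilityai/stable-audio-open-1.0`, and that the sampler settings shown (100 steps, `cfg_scale=7`, `dpmpp-3m-sde`) are acceptable defaults. The helper names `build_conditioning` and `generate_to_wav` and the example prompt are hypothetical and introduced only for illustration.

```python
MAX_SECONDS = 47  # upper limit stated in the feature list


def build_conditioning(prompt, seconds_total, seconds_start=0):
    """Step 3: text and timing conditions for a single clip (hypothetical helper)."""
    if not 0 < seconds_total <= MAX_SECONDS:
        raise ValueError(f"seconds_total must be in (0, {MAX_SECONDS}]")
    return [{
        "prompt": prompt,
        "seconds_start": seconds_start,
        "seconds_total": seconds_total,
    }]


def generate_to_wav(prompt, seconds_total, out_path="output.wav"):
    """Steps 2-5: load the model, generate, post-process, and save (hypothetical helper)."""
    # Heavy dependencies are imported here so build_conditioning works without them.
    import torch
    import torchaudio
    from einops import rearrange
    from stable_audio_tools import get_pretrained_model
    from stable_audio_tools.inference.generation import generate_diffusion_cond

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Step 2: download the pre-trained model and its config.
    # ASSUMPTION: this Hugging Face identifier is the one you have access to.
    model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
    model = model.to(device)

    # Step 4: run conditional diffusion generation.
    output = generate_diffusion_cond(
        model,
        steps=100,                      # ASSUMPTION: example setting
        cfg_scale=7,                    # ASSUMPTION: example setting
        conditioning=build_conditioning(prompt, seconds_total),
        sample_size=model_config["sample_size"],
        sampler_type="dpmpp-3m-sde",    # ASSUMPTION: example sampler
        device=device,
    )

    # Step 5: merge the batch into one sequence, peak-normalize, clip,
    # convert to int16, and save.
    output = rearrange(output, "b d n -> d (b n)").to(torch.float32)
    output = output.div(torch.max(torch.abs(output))).clamp(-1, 1)
    output = output.mul(32767).to(torch.int16).cpu()
    torchaudio.save(out_path, output, model_config["sample_rate"])
    return out_path
```

A call such as `generate_to_wav("128 BPM tech house drum loop", 30)` would then write a 16-bit WAV file. Generation is slow on CPU, so a GPU is advisable in practice.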