
Meta launches AudioCraft, an AI tool that turns simple text into audio and music

Meta has introduced a new open-source AI tool called AudioCraft, aimed at letting both professional musicians and everyday users generate audio and music from simple text prompts. The tool consists of three models: MusicGen, AudioGen, and EnCodec. MusicGen generates music from text prompts, drawing on Meta's own music library, while AudioGen generates audio from text prompts, drawing on public sound effects. The improved EnCodec decoder produces higher-quality music with fewer artifacts.

Meta is also releasing its pre-trained AudioGen models, which let users generate environmental sounds and sound effects such as a dog barking or a car honking. The company is sharing AudioCraft's model weights and code, supporting applications such as music composition, sound-effect creation, audio compression, and audio generation. By open-sourcing these models, Meta aims to empower researchers and practitioners to train their own models on their own datasets.

While generative AI has advanced rapidly for images, video, and text, audio has lagged behind. AudioCraft aims to close this gap by offering a user-friendly platform for generating high-quality audio. Meta acknowledges that producing realistic, high-fidelity audio is difficult because it involves modeling intricate signals and patterns at multiple scales; music is especially challenging, since it combines local and long-range structure. AudioCraft simplifies the design of generative audio models and makes experimenting with them easier.
