MusicGen-Style — AEmotionStudio mirror
1:1 mirror of facebook/musicgen-style. Used by the MAESTRO / Æmotion Studio AI Workstation's MusicGen Style panel (Design → MusicGen Style).
License — Non-Commercial
Weights: CC-BY-NC-4.0. Generated outputs may NOT be used in commercial projects, paid releases, or client work.
Code (audiocraft): MIT. The MAESTRO runner is also MIT; the non-commercial clause attaches only to the weights and to anything derived from running them.
If you need a permissive substitute for commercial work, see Stable Audio Open (Stability Community License) instead.
Format
This mirror keeps the upstream .bin layout (PyTorch pickle)
verbatim — state_dict.bin (the 1.5 B language model) plus
compression_state_dict.bin (the EnCodec compression model).
We do NOT convert to safetensors here because audiocraft's
MusicGen.get_pretrained() loader expects pickled {xp.cfg, best_state} packages and pulls the OmegaConf cfg blob alongside the tensor dict in one torch.load call. Converting
would require splitting cfg into a sidecar and bypassing the
upstream loader — deferred to a follow-up.
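As a rough illustration of the package layout described above, the sketch below separates an already-loaded package into its config blob and its tensor dict. The helper name split_package is hypothetical and not part of audiocraft; only the xp.cfg and best_state keys are taken from the description above.

```python
def split_package(pkg):
    """Return (cfg, state_dict) from an audiocraft-style checkpoint package.

    Hypothetical helper for illustration only; not part of audiocraft.
    """
    required = ("xp.cfg", "best_state")
    missing = [key for key in required if key not in pkg]
    if missing:
        raise KeyError(f"missing package keys: {missing}")
    return pkg["xp.cfg"], pkg["best_state"]


# Assumed usage on a trusted local copy (requires torch, not run here):
#   import torch
#   pkg = torch.load("state_dict.bin", map_location="cpu", weights_only=False)
#   cfg, state = split_package(pkg)
```

A safetensors conversion would have to store the second element as the tensor file and serialize the first one separately, which is the sidecar split mentioned above.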
PyTorch 2.6+'s default weights_only=True rejects these pickles
(numpy scalars in xp.cfg). MAESTRO's runner wraps the load in a
_TorchLoadWeightsOnlyShim context manager; vanilla audiocraft
users on torch ≥ 2.6 will hit the same issue and need a similar
shim.
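One possible shape for such a shim is sketched below. This is not MAESTRO's _TorchLoadWeightsOnlyShim; the name allow_full_unpickle is hypothetical, and the sketch assumes the loader calls torch.load without passing weights_only itself. It temporarily wraps a module's load function so that calls default to weights_only=False, then restores the original.

```python
import contextlib
import functools


@contextlib.contextmanager
def allow_full_unpickle(module, attr="load"):
    """Temporarily make module.<attr> default to weights_only=False.

    Hypothetical helper; pass the torch module to patch torch.load.
    """
    original = getattr(module, attr)

    @functools.wraps(original)
    def patched(*args, **kwargs):
        # Only fill in the default; an explicit weights_only from the caller wins.
        kwargs.setdefault("weights_only", False)
        return original(*args, **kwargs)

    setattr(module, attr, patched)
    try:
        yield
    finally:
        # Always restore the original function, even if loading fails.
        setattr(module, attr, original)


# Assumed usage (requires torch and audiocraft, not run here):
#   import torch
#   from audiocraft.models import MusicGen
#   with allow_full_unpickle(torch):
#       model = MusicGen.get_pretrained("AEmotionStudio/musicgen-style-models")
```

Loading with weights_only=False runs arbitrary pickle code, so a shim like this is only appropriate for checkpoints from a source you trust.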
Loading
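import torchaudio
from audiocraft.models import MusicGen

# Load the mirrored checkpoint (on torch >= 2.6, see the weights_only note under Format).
model = MusicGen.get_pretrained('AEmotionStudio/musicgen-style-models', device='cuda')
model.set_generation_params(duration=10)  # seconds of audio to generate
model.set_style_conditioner_params(eval_q=3, excerpt_length=3.0)  # excerpt_length is in seconds

# Reference audio that supplies the style; shape [channels, samples].
wav, sr = torchaudio.load('reference.wav')
out = model.generate_with_chroma(
    descriptions=['ambient piano'],
    melody_wavs=wav.unsqueeze(0),  # add a batch dimension: [1, channels, samples]
    melody_sample_rate=sr,
)
torchaudio.save('out.wav', out[0].cpu(), model.sample_rate)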
Citation
Audio style conditioning is described in Meta's paper:
Rouard, S., Adi, Y., Copet, J., Roebel, A., & Défossez, A. (2024). Audio Conditioning for Music Generation via Discrete Bottleneck Features. arXiv:2407.12563.