How to use mlx-community/audiogen-medium-mlx with MLX:
# Download the model from the Hub
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir audiogen-medium-mlx mlx-community/audiogen-medium-mlx
AudioGen Medium (MLX)
This is the MLX-native port of facebook/audiogen-medium, a 1.5B-parameter autoregressive transformer for text-to-audio generation.
Model Details
- Architecture: Autoregressive Transformer LM over EnCodec discrete tokens
- Parameters: ~1.5B (LM) + EnCodec compression model
- Sampling rate: 16 kHz
- Frame rate: 50 Hz (4 codebooks, delayed pattern)
- Text encoder: T5-small (loaded separately)
- Max duration: 10 seconds (configurable)
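The figures above imply a simple token budget for a clip. The following is a minimal sketch of that arithmetic, not code from the MLXAudioGen package: the `CodecLayout` type and its members are hypothetical names, and the one-step-per-codebook offset is an assumption about how the delayed pattern is laid out (it matches the delay pattern described for the upstream AudioCraft models).

```swift
import Foundation

// Hypothetical helper type; the constants are the values listed in Model Details.
struct CodecLayout {
    let sampleRate = 16_000   // Hz
    let frameRate = 50        // EnCodec frames per second
    let codebooks = 4

    // Audio samples represented by one EnCodec frame.
    var samplesPerFrame: Int { sampleRate / frameRate }

    // Number of EnCodec frames needed for a clip of the given length.
    func frames(forSeconds seconds: Double) -> Int {
        Int((seconds * Double(frameRate)).rounded(.up))
    }

    // Total discrete tokens across all codebooks.
    func tokens(forSeconds seconds: Double) -> Int {
        frames(forSeconds: seconds) * codebooks
    }

    // Decoding steps under the ASSUMED delay layout where codebook k is
    // shifted by k steps, so the last codebook finishes (codebooks - 1) steps late.
    func delayedSteps(forSeconds seconds: Double) -> Int {
        frames(forSeconds: seconds) + codebooks - 1
    }
}

let layout = CodecLayout()
print(layout.samplesPerFrame)               // 320
print(layout.frames(forSeconds: 10))        // 500
print(layout.tokens(forSeconds: 10))        // 2000
print(layout.delayedSteps(forSeconds: 10))  // 503
```

Under these assumptions, the default 10-second maximum corresponds to 500 frames, or 2,000 tokens across the four codebooks.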
Files
- config.json: Model configuration
- model.safetensors: LM + EnCodec weights
- model.safetensors.index.json: Weight index (for sharded variants)
- tokenizer.json / tokenizer_config.json: T5 tokenizer files
Usage (Swift/MLX)
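import MLXAudioGen

// modelURL and t5URL are file URLs pointing to local folders that hold
// the AudioGen weights and the T5 encoder, respectively.
let model = try await AudioGenModel.fromPretrained(
    modelFolder: modelURL,
    t5Folder: t5URL
)

// Generate 5 seconds of audio from a text prompt.
let audio = try await model.generateAudio(
    description: "dog barking",
    duration: 5.0,      // seconds
    cfgCoef: 3.0,       // classifier-free guidance strength
    temperature: 1.0,   // sampling temperature
    topK: 250           // sample from the 250 most likely tokens
)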
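For readers unfamiliar with the sampling parameters, here is a rough sketch of what `cfgCoef`, `temperature`, and `topK` conventionally mean in autoregressive audio models. It is not the MLXAudioGen implementation, which is not shown in this card; the function names `applyGuidance`, `topKProbabilities`, and `sample` are hypothetical, and the code works on plain `[Float]` arrays rather than MLX arrays.

```swift
import Foundation

// Classifier-free guidance: move from the unconditional logits toward the
// text-conditioned logits, scaled by `coef`.
func applyGuidance(conditional: [Float], unconditional: [Float], coef: Float) -> [Float] {
    zip(conditional, unconditional).map { $1 + coef * ($0 - $1) }
}

// Temperature-scaled softmax restricted to the k largest logits.
// Entries outside the top k receive probability 0.
func topKProbabilities(logits: [Float], k: Int, temperature: Float) -> [Float] {
    precondition(!logits.isEmpty && k > 0 && temperature > 0)
    let scaled = logits.map { $0 / temperature }
    let ranked = scaled.indices.sorted { scaled[$0] > scaled[$1] }
    let kept = Set(ranked.prefix(k))
    let maxLogit = scaled[ranked[0]]  // subtract the max for numerical stability
    let weights: [Float] = scaled.indices.map { i in
        kept.contains(i) ? exp(scaled[i] - maxLogit) : 0
    }
    let total = weights.reduce(0, +)
    return weights.map { $0 / total }
}

// Draw one token index from a probability vector.
func sample(from probabilities: [Float]) -> Int {
    var r = Float.random(in: 0..<1)
    for (i, p) in probabilities.enumerated() {
        r -= p
        if r < 0 { return i }
    }
    // Fall back to the last token with non-zero probability (rounding edge case).
    return probabilities.lastIndex { $0 > 0 } ?? 0
}

// Toy example with a 3-token vocabulary.
let guided = applyGuidance(conditional: [2, 0, 1], unconditional: [1, 0, 1], coef: 3.0)
let probs = topKProbabilities(logits: guided, k: 2, temperature: 1.0)
let token = sample(from: probs)
print(guided, probs, token)
```

A larger `cfgCoef` makes the output follow the text prompt more closely, while a lower `temperature` or smaller `topK` narrows the set of tokens that can be drawn.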
License
This model is published under the CC-BY-NC 4.0 license (non-commercial use only), following the original AudioGen license.
Model size: 2B params
Tensor type: F32 · F16