ACE-Step MLX Weights (int4 quantized)
MLX-compatible int4-quantized safetensors of ACE-Step 1.5 for on-device music generation on memory-constrained Apple Silicon devices (e.g. iPhone). Used by the mochi records iOS / macOS app for the Q4 variant.
For full-precision (fp16) weights, see the sibling repo
mochiexists528/ace-step-mlx-weights.
Repository contents
text_encoder/                    Qwen3-Embedding-0.6B text encoder (fp16 safetensors; encoder kept FP for parity)
dit/encoder.safetensors          DiT encoder blocks (int4 quantized)
dit/decoder.safetensors          DiT decoder blocks (int4 quantized)
dit/manifest.json                key → file map for the split DiT
dit/silence_latent.safetensors   silence src_latents seed (fp16)
detokenizer.safetensors          FSQ + AudioTokenDetokenizer keys (fp16, Q4-specific sidecar; see note)
vae/                             Oobleck VAE decoder (fp16 safetensors)
tokenizer/                       Qwen3 tokenizer files
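After downloading, it can be useful to confirm that a local copy matches the listing above before pointing an app or script at it. The following is a minimal sketch using only the Python standard library. The helper name `missing_paths` is hypothetical, and the check only tests for existence; it does not validate file contents.

```python
from pathlib import Path

# Entries copied from the "Repository contents" listing.
EXPECTED_PATHS = [
    "text_encoder",
    "dit/encoder.safetensors",
    "dit/decoder.safetensors",
    "dit/manifest.json",
    "dit/silence_latent.safetensors",
    "detokenizer.safetensors",
    "vae",
    "tokenizer",
]


def missing_paths(root):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]
```

For example, `missing_paths("./models-q4")` returns an empty list when every listed file and directory is present.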
Q4 detokenizer sidecar
The FSQ + AudioTokenDetokenizer tensors are bundled inline inside the upstream
fp16 DiT checkpoint. The int4 quantization step splits the DiT into
encoder.safetensors + decoder.safetensors + manifest.json, which does
not have a natural home for those tensors. They are carried as a separate
detokenizer.safetensors sidecar. The FP variant does not need this file.
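To illustrate how a loader might consume the split layout, here is a rough sketch that groups tensor keys by the file the manifest assigns them to. It assumes manifest.json is a flat JSON object of the form {"tensor.key": "file.safetensors"}; the real schema may differ. The key names in the example are made up, not copied from the repo.

```python
import json
from collections import defaultdict


def keys_by_file(manifest_text):
    """Group tensor keys by the shard file the manifest maps them to.

    Assumes a flat {"tensor.key": "file.safetensors"} JSON object.
    """
    manifest = json.loads(manifest_text)
    grouped = defaultdict(list)
    for key, filename in manifest.items():
        grouped[filename].append(key)
    return dict(grouped)


# Hypothetical manifest content for illustration only.
example = json.dumps({
    "encoder.layers.0.attn.q_proj.weight": "encoder.safetensors",
    "decoder.layers.0.attn.q_proj.weight": "decoder.safetensors",
    "decoder.layers.0.attn.k_proj.weight": "decoder.safetensors",
})

for filename, keys in keys_by_file(example).items():
    print(filename, len(keys))
```

Under this assumption, the FSQ and AudioTokenDetokenizer keys would not appear in the manifest at all; a loader would read them from detokenizer.safetensors directly.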
Phase 6 planner LM
Phase 6 is optional. Baseline Q4 generation only needs the text encoder, DiT,
VAE, tokenizer, silence latent, and detokenizer files above. To enable Phase 6,
pass a shared planner snapshot with --lm-weights.
The recommended planner repos are:
- mochiexists528/ace-step-mlx-planner-1.7b-q4 for lower storage and memory use
- mochiexists528/ace-step-mlx-planner-1.7b for fp16 desktop/reference runs
Source attribution
This repository is a derivative work. Source models, licenses, and the
upstream NOTICE are identical to the FP repo — see
mochiexists528/ace-step-mlx-weights
README for the full attribution table.
Summary:
- DiT, VAE: ACE-Step 1.5 (MIT) — upstream
- Text encoder, tokenizer: Qwen3 (Apache 2.0) — upstream
- Optional 5Hz planner LM: ACE-Step fine-tune of Qwen3-1.7B (Apache 2.0), distributed through the shared planner repos listed above
Modifications from upstream
Apache 2.0 §4(b) requires us to state changes. This repo makes two modifications beyond the upstream FP layout:
- dtype repack (same as FP repo): bf16 → fp16 cast for MLX compatibility.
- int4 quantization: the DiT encoder + decoder block weights are
quantized via mlx.nn.quantize (group size 64, bits 4). This is a lossy modification: audio output differs from the FP path, particularly at low inference-step counts. See the mochi records project benchmarks for measured quality and memory deltas.
The text encoder and VAE are kept in fp16 in this repo (quantizing them introduces artifacts out of proportion to the memory saved). Planner weights are not duplicated in this model repo; use the shared fp16 or Q4 planner repo based on the working-set budget on your target device.
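To show why group-wise int4 quantization is lossy, here is a pure-Python sketch of affine quantization over one group of 64 weights. It follows the min/max affine scheme described in the MLX quantization documentation only in outline. It is not the code used to produce this repo and does not call MLX. The storage estimate assumes one 16-bit scale and one 16-bit bias per group.

```python
import random

GROUP_SIZE = 64  # group size stated above
BITS = 4         # bit width stated above


def quantize_group(weights, bits=BITS):
    """Affine-quantize one group; returns (codes, scale, bias)."""
    levels = (1 << bits) - 1               # 15 steps for 4 bits
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels or 1.0      # fall back to 1.0 for a flat group
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Map integer codes back to approximate float weights."""
    return [c * scale + bias for c in codes]


random.seed(0)
group = [random.gauss(0.0, 0.02) for _ in range(GROUP_SIZE)]
codes, scale, bias = quantize_group(group)
restored = dequantize_group(codes, scale, bias)
max_err = max(abs(a - b) for a, b in zip(group, restored))

# Storage: 4-bit codes plus one 16-bit scale and one 16-bit bias per group
# (the 16-bit widths are an assumption).
bits_per_weight = BITS + (16 + 16) / GROUP_SIZE

print(f"max abs error: {max_err:.6f} (bound: scale / 2 = {scale / 2:.6f})")
print(f"effective bits per weight: {bits_per_weight}")
```

Each weight is rounded to one of 16 levels within its group, so the per-weight error is bounded by half the group's step size. Under the storage assumption above, the per-group scale and bias add 0.5 bits per weight on top of the 4-bit codes.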
NOTICE (Apache 2.0 §4(d))
See the FP repo
mochiexists528/ace-step-mlx-weights
README for the Qwen3 NOTICE carry-forward and the ACE-Step MIT license text.
Both apply identically to this repo's contents.
Usage
hf download mochiexists528/ace-step-mlx-weights-q4 --local-dir ./models-q4
Then in the mochi records app, select the Q4 variant in Settings → Model Management, or let the in-app downloader pull this repo on first launch when running on a Q4-targeted device.
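For scripted setups, the same download can be done from Python. This sketch uses `snapshot_download` from the `huggingface_hub` package; the wrapper name `download_q4_weights` and the default directory are illustrative choices, not part of the app.

```python
REPO_ID = "mochiexists528/ace-step-mlx-weights-q4"


def download_q4_weights(local_dir="./models-q4"):
    """Download the full repo snapshot and return the local path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_q4_weights()` requires network access and `pip install huggingface_hub`.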
Trademark notice
"ACE-Step" and "Qwen" are names of their respective upstream projects. This repository is not endorsed by, sponsored by, or affiliated with either project. Use of the names is solely for accurate attribution of the underlying weights.