ACE-Step MLX Weights (int4 quantized)

MLX-compatible int4-quantized safetensors of ACE-Step 1.5 for on-device music generation on memory-constrained Apple Silicon devices (e.g. iPhone). Used by the mochi records iOS / macOS app for the Q4 variant.

For full-precision (fp16) weights, see the sibling repo mochiexists528/ace-step-mlx-weights.

Repository contents

text_encoder/                    Qwen3-Embedding-0.6B text encoder   (fp16 safetensors; encoder kept in FP for parity)
dit/encoder.safetensors          DiT encoder blocks                  (int4 quantized)
dit/decoder.safetensors          DiT decoder blocks                  (int4 quantized)
dit/manifest.json                key → file map for the split DiT
dit/silence_latent.safetensors   silence src_latents seed            (fp16)
detokenizer.safetensors          FSQ + AudioTokenDetokenizer keys    (fp16; Q4-specific sidecar, see note below)
vae/                             Oobleck VAE decoder                 (fp16 safetensors)
tokenizer/                       Qwen3 tokenizer files
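
After downloading, it can be useful to confirm that a local copy has every entry in the listing above. The following is a minimal sketch using only the Python standard library; the helper name missing_entries is hypothetical and is not part of the mochi records tooling. It checks presence only, not file contents.

```python
from pathlib import Path

# Entries copied from the repository listing; directories end with "/".
EXPECTED_ENTRIES = [
    "text_encoder/",
    "dit/encoder.safetensors",
    "dit/decoder.safetensors",
    "dit/manifest.json",
    "dit/silence_latent.safetensors",
    "detokenizer.safetensors",
    "vae/",
    "tokenizer/",
]


def missing_entries(root):
    """Return the expected entries that are absent under `root` (hypothetical helper)."""
    root = Path(root)
    missing = []
    for entry in EXPECTED_ENTRIES:
        path = root / entry
        # Directory entries must be directories; everything else must be a file.
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing
```

For example, missing_entries("./models-q4") returns an empty list when the local directory is complete.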

Q4 detokenizer sidecar

In the upstream fp16 DiT checkpoint, the FSQ + AudioTokenDetokenizer tensors are bundled inline. The int4 quantization step splits the DiT into encoder.safetensors, decoder.safetensors, and manifest.json, and that split layout has no natural home for those tensors, so they are carried in a separate fp16 detokenizer.safetensors sidecar. The FP variant does not need this file.
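
To illustrate how a loader might use the key → file map in dit/manifest.json, here is a small sketch that groups tensor keys by the shard that holds them. The manifest schema is not documented here, so this assumes a flat JSON object of the form {tensor_key: file_name}; the function name group_keys_by_file is hypothetical. Inspect the real manifest before relying on this shape.

```python
import json
from pathlib import Path


def group_keys_by_file(manifest_path):
    """Group tensor keys by shard file name (hypothetical helper).

    ASSUMPTION: the manifest is a flat JSON object {tensor_key: file_name}.
    The actual schema of dit/manifest.json may differ.
    """
    mapping = json.loads(Path(manifest_path).read_text())
    by_file = {}
    for key, file_name in mapping.items():
        by_file.setdefault(file_name, []).append(key)
    return by_file
```

A loader could then open each shard once and read only the keys listed for it, while the FSQ + AudioTokenDetokenizer keys come from detokenizer.safetensors instead.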

Phase 6 planner LM

Phase 6 (the planner LM) is optional. Baseline Q4 generation needs only the text encoder, DiT, VAE, tokenizer, silence latent, and detokenizer files listed above. To enable Phase 6, pass a shared planner snapshot with --lm-weights.

The recommended planner repos are:

Source attribution

This repository is a derivative work. Source models, licenses, and the upstream NOTICE are identical to those of the FP repo; see the mochiexists528/ace-step-mlx-weights README for the full attribution table.

Summary:

  • DiT, VAE: ACE-Step 1.5 (MIT) — upstream
  • Text encoder, tokenizer: Qwen3 (Apache 2.0) — upstream
  • Optional 5Hz planner LM: ACE-Step fine-tune of Qwen3-1.7B (Apache 2.0), distributed through the shared planner repos listed above

Modifications from upstream

Apache 2.0 §4(b) requires us to state changes. There are two modifications relative to the upstream FP layout:

  1. dtype repack (same as FP repo): bf16 → fp16 cast for MLX compatibility.
  2. int4 quantization: the DiT encoder + decoder block weights are quantized via mlx.nn.quantize (group_size=64, bits=4). This is a lossy modification: audio output differs from the FP path, particularly at low inference-step counts. See the mochi records project benchmarks for measured quality and memory deltas.
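
As a rough guide to the memory side of item 2, the storage cost of group-wise 4-bit quantization can be estimated with simple arithmetic. The sketch below assumes each group of 64 weights stores one fp16 scale and one fp16 bias alongside the packed 4-bit values; that overhead model is an assumption for illustration, and the function names are hypothetical. It is an estimate, not a measurement of this repo's files.

```python
def effective_bits_per_weight(bits=4, group_size=64, scale_bits=16, bias_bits=16):
    """Packed bits per weight plus per-group scale/bias overhead.

    ASSUMPTION: one scale and one bias per group, both stored as fp16.
    """
    return bits + (scale_bits + bias_bits) / group_size


def quantized_size_bytes(num_weights, bits=4, group_size=64):
    """Estimated bytes needed to store `num_weights` quantized weights."""
    return num_weights * effective_bits_per_weight(bits, group_size) / 8


print(effective_bits_per_weight())       # 4.5
print(16 / effective_bits_per_weight())  # about 3.56x smaller than fp16
```

Under these assumptions the quantized DiT blocks take about 4.5 bits per weight rather than 16, which is where most of the Q4 variant's memory saving would come from; the fp16 text encoder, VAE, and sidecar are unaffected.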

The text encoder and VAE are kept in fp16 in this repo because quantizing them introduces artifacts that are disproportionate to the memory saved. Planner weights are not duplicated in this model repo; use the shared fp16 or Q4 planner repo, depending on the working-set budget of your target device.

NOTICE (Apache 2.0 §4(d))

See the FP repo mochiexists528/ace-step-mlx-weights README for the Qwen3 NOTICE carry-forward and the ACE-Step MIT license text. Both apply identically to this repo's contents.

Usage

hf download mochiexists528/ace-step-mlx-weights-q4 --local-dir ./models-q4

Then in the mochi records app, select the Q4 variant in Settings → Model Management, or let the in-app downloader pull this repo on first launch when running on a Q4-targeted device.

Trademark notice

"ACE-Step" and "Qwen" are names of their respective upstream projects. This repository is not endorsed by, sponsored by, or affiliated with either project. Use of the names is solely for accurate attribution of the underlying weights.
