ACE-Step MLX Weights XL Q4
MLX-compatible int4 package for ACE-Step 1.5 XL-turbo. This repo follows the
same core layout as mochiexists528/ace-step-mlx-weights-xl, but the XL DiT
decoder is pre-quantized for lower runtime memory and the Q4 Phase 6
conditioning tensors are carried as detokenizer.safetensors.
Contents
text_encoder/                     Qwen3-Embedding-0.6B text encoder
dit/encoder.safetensors           XL-turbo DiT encoder tensors
dit/decoder.safetensors           XL-turbo DiT decoder tensors, int4
dit/manifest.json                 quantization metadata
dit/silence_latent.safetensors    silence latent seed
vae/                              ACE-Step Oobleck VAE decoder
tokenizer/                        Qwen3 tokenizer files
detokenizer.safetensors           optional shared Phase 6 FSQ + AudioTokenDetokenizer sidecar
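After downloading, a short script can confirm that a local snapshot matches the listing above. This is a minimal sketch and not part of AceStepCLI: `check_snapshot` and the constant names are hypothetical, and it assumes that every entry except detokenizer.safetensors is needed for baseline generation.

```python
from pathlib import Path

# Entries copied from the Contents listing. Treating all of them as required
# for baseline generation is an assumption of this sketch.
BASELINE_ENTRIES = [
    "text_encoder",
    "dit/encoder.safetensors",
    "dit/decoder.safetensors",
    "dit/manifest.json",
    "dit/silence_latent.safetensors",
    "vae",
    "tokenizer",
]

# Optional Phase 6 sidecar, only needed for LM-conditioning.
PHASE6_SIDECAR = "detokenizer.safetensors"


def check_snapshot(snap_dir):
    """Return (missing_baseline_entries, sidecar_present) for a local snapshot."""
    root = Path(snap_dir)
    missing = [rel for rel in BASELINE_ENTRIES if not (root / rel).exists()]
    sidecar_present = (root / PHASE6_SIDECAR).is_file()
    return missing, sidecar_present


if __name__ == "__main__":
    missing, sidecar = check_snapshot("ace-step-mlx-weights-xl-q4")
    print("missing baseline entries:", missing or "none")
    print("Phase 6 sidecar present:", sidecar)
```

An empty `missing` list means the baseline files are in place; the sidecar flag only matters when Phase 6 is enabled.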
Phase 6 LM-Conditioning
Phase 6 is optional. Baseline XL-Q4 generation only needs the text encoder, DiT,
VAE, tokenizer, and silence latent files above. To enable Phase 6
LM-conditioning, pass a shared planner snapshot plus this repo's shared
conditioning sidecar. The recommended planner for device use is
mochiexists528/ace-step-mlx-planner-1.7b-q4; the fp16 planner is a
reference/desktop option.
SNAP=/path/to/ace-step-mlx-weights-xl-q4
PLANNER=/path/to/ace-step-mlx-planner-1.7b-q4
AceStepCLI generate \
  --text-encoder-dir "$SNAP/text_encoder" \
  --dit-dir "$SNAP/dit" \
  --vae-dir "$SNAP/vae" \
  --tokenizer-dir "$SNAP/tokenizer" \
  --lm-weights "$PLANNER" \
  --lm-conditioning-checkpoint "$SNAP/detokenizer.safetensors" \
  --quantize \
  ...
detokenizer.safetensors contains only the FSQ projection and
AudioTokenDetokenizer tensors used by the validated Phase 6 path. It is shared
with the turbo package; the planner produces DiT-agnostic 25 Hz acoustic hints,
and the XL DiT consumes those hints during diffusion. The sidecar is not loaded
for baseline generation.
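To see which tensors the sidecar actually carries, the safetensors header can be read without loading any weights. The sketch below relies only on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header); the function name `list_safetensors_tensors` is hypothetical, and no particular tensor names are assumed.

```python
import json
import struct


def list_safetensors_tensors(path):
    """Map each tensor name in a .safetensors file to its (dtype, shape)."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned 64-bit little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return {name: (info["dtype"], info["shape"]) for name, info in header.items()}


if __name__ == "__main__":
    tensors = list_safetensors_tensors("detokenizer.safetensors")
    for name, (dtype, shape) in sorted(tensors.items()):
        print(name, dtype, shape)
```

The same helper works on the files under dit/, which is one way to compare the encoder and decoder tensors by dtype and shape.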
Source Attribution
This is a derivative repack of upstream ACE-Step and Qwen artifacts for MLX. The XL-turbo DiT and VAE come from ACE-Step 1.5. The text encoder, tokenizer, and planner base family come from Qwen3. Upstream bfloat16 tensors are stored as fp16 where needed for MLX safetensors compatibility. The DiT decoder is quantized with MLX affine int4 quantization.
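For readers unfamiliar with affine int4 quantization, the idea can be illustrated in a few lines of plain Python: each group of weights is mapped to 4-bit codes in 0..15 plus one scale and one offset. This is a conceptual sketch only. It does not reproduce MLX's kernels, bit packing, or group size, and the function names are hypothetical.

```python
def quantize_group(values, bits=4):
    """Affine-quantize one group of floats to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1          # 15 for int4
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels
    if scale == 0.0:                  # constant group: any nonzero scale works
        scale = 1.0
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo           # lo serves as the per-group offset (bias)


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate floats: w ~= code * scale + bias."""
    return [c * scale + bias for c in codes]


if __name__ == "__main__":
    group = [-1.0, -0.3, 0.0, 0.7, 2.0]
    codes, scale, bias = quantize_group(group)
    approx = dequantize_group(codes, scale, bias)
    worst = max(abs(a - b) for a, b in zip(group, approx))
    print(codes, scale, bias, worst)
```

The reconstruction error per weight is bounded by half the scale, which is why quantization is applied over small groups rather than over a whole tensor at once.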