ACE-Step MLX Weights XL Q4

MLX-compatible int4 package for ACE-Step 1.5 XL-turbo. This repo follows the same core layout as mochiexists528/ace-step-mlx-weights-xl, with two differences: the XL DiT decoder is pre-quantized to int4 for lower runtime memory, and the Q4 Phase 6 conditioning tensors ship as detokenizer.safetensors.

Contents

text_encoder/                    Qwen3-Embedding-0.6B text encoder
dit/encoder.safetensors          XL-turbo DiT encoder tensors
dit/decoder.safetensors          XL-turbo DiT decoder tensors, int4
dit/manifest.json                quantization metadata
dit/silence_latent.safetensors   silence latent seed
vae/                             ACE-Step Oobleck VAE decoder
tokenizer/                       Qwen3 tokenizer files
detokenizer.safetensors          optional shared Phase 6 FSQ + AudioTokenDetokenizer sidecar
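
Before pointing the CLI at a downloaded snapshot, it can help to confirm the layout above is complete. The following is a minimal sketch, not part of this repo: the function name check_snapshot is hypothetical, and it only tests that each listed path exists, not that the files are valid.

```shell
# Report entries from the Contents list that are missing in a snapshot directory.
# detokenizer.safetensors is optional, so its absence is only noted and does
# not change the exit status.
check_snapshot() {
  snap=$1
  missing=0
  for entry in text_encoder dit/encoder.safetensors dit/decoder.safetensors \
               dit/manifest.json dit/silence_latent.safetensors vae tokenizer; do
    if [ ! -e "$snap/$entry" ]; then
      echo "missing: $entry"
      missing=1
    fi
  done
  if [ ! -e "$snap/detokenizer.safetensors" ]; then
    echo "note: detokenizer.safetensors absent (Phase 6 unavailable)"
  fi
  return "$missing"
}
```

Calling check_snapshot /path/to/ace-step-mlx-weights-xl-q4 returns 0 when every required entry is present and 1 otherwise, so it can gate a later generate call.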

Phase 6 LM-Conditioning

Phase 6 is optional. Baseline XL-Q4 generation needs only the text encoder, DiT, VAE, tokenizer, and silence latent files listed above. To enable Phase 6 LM-conditioning, pass a shared planner snapshot together with this repo's conditioning sidecar, detokenizer.safetensors. The recommended planner for on-device use is mochiexists528/ace-step-mlx-planner-1.7b-q4; the fp16 planner is a reference/desktop option.

SNAP=/path/to/ace-step-mlx-weights-xl-q4
PLANNER=/path/to/ace-step-mlx-planner-1.7b-q4

AceStepCLI generate \
  --text-encoder-dir "$SNAP/text_encoder" \
  --dit-dir "$SNAP/dit" \
  --vae-dir "$SNAP/vae" \
  --tokenizer-dir "$SNAP/tokenizer" \
  --lm-weights "$PLANNER" \
  --lm-conditioning-checkpoint "$SNAP/detokenizer.safetensors" \
  --quantize \
  ...

detokenizer.safetensors contains only the FSQ projection and AudioTokenDetokenizer tensors used by the validated Phase 6 path. It is shared with the turbo package; the planner produces DiT-agnostic 25 Hz acoustic hints, and the XL DiT consumes those hints during diffusion. The sidecar is not loaded for baseline generation.
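
Because the sidecar and planner are only needed for Phase 6, a wrapper script can add those two flags conditionally. The sketch below is illustrative: build_generate_args is a hypothetical helper, it uses only the flags shown in the command above, and keeping --quantize for the baseline case is an assumption carried over from that command rather than something this card states.

```shell
# Print one CLI argument per line for a generate call.
# $1 = snapshot dir, $2 = planner dir (optional; empty means baseline).
build_generate_args() {
  snap=$1
  planner=${2:-}
  printf '%s\n' \
    --text-encoder-dir "$snap/text_encoder" \
    --dit-dir "$snap/dit" \
    --vae-dir "$snap/vae" \
    --tokenizer-dir "$snap/tokenizer"
  # Phase 6 needs both the planner snapshot and the sidecar file.
  if [ -n "$planner" ] && [ -f "$snap/detokenizer.safetensors" ]; then
    printf '%s\n' \
      --lm-weights "$planner" \
      --lm-conditioning-checkpoint "$snap/detokenizer.safetensors"
  fi
  # Assumption: the int4 package is always run with --quantize.
  printf '%s\n' --quantize
}
```

The output is meant for inspection or for feeding into a wrapper; paths containing newlines are not handled, and the remaining options elided as ... in the command above still have to be supplied.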

Source Attribution

This is a derivative repack of upstream ACE-Step and Qwen artifacts for MLX. The XL-turbo DiT and VAE come from ACE-Step 1.5. The text encoder, tokenizer, and planner base family come from Qwen3. Upstream bfloat16 tensors are stored as fp16 where needed for MLX safetensors compatibility. The DiT decoder is quantized with MLX affine int4 quantization.
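
For readers unfamiliar with affine quantization, the scheme can be summarized as below, following the description in the MLX documentation for group-wise quantization. The bit width b = 4 matches this package; the group size is not stated in this card and is presumably recorded in dit/manifest.json, so treat it as an unknown here.

```latex
% Affine quantization of one group of weights w_1, ..., w_g with b bits (b = 4 here).
\begin{align*}
\alpha &= \max_i w_i, \qquad \beta = \min_i w_i, \\
s &= \frac{\alpha - \beta}{2^{b} - 1}, \\
\hat{w}_i &= \operatorname{round}\!\left(\frac{w_i - \beta}{s}\right) \in \{0, 1, \dots, 2^{b} - 1\}, \\
\tilde{w}_i &= s\,\hat{w}_i + \beta \quad \text{(dequantized value)}.
\end{align*}
```

With b = 4 each weight is stored as one of 16 levels plus a per-group scale s and bias \beta, and under this idealized formula the reconstruction error per weight is at most s/2.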
