Qwen3.6 27B MTPLX Optimized Quality

This is the MTPLX Optimized Quality artifact for Qwen3.6 27B.

Policy:

  • Target trunk: true flat 8-bit MLX affine quantization, group size 64.
  • MTP sidecar: calibrated CyanKiwi INT8 MTP linear layers, converted from the compressed-tensors format of cyankiwi/Qwen3.6-27B-AWQ-BF16-INT8 to MLX affine quantization with group size 128.
  • MTP auxiliary tensors (mtp.fc, norms, scales/biases, and non-linear tensors) are preserved in BF16 where applicable.
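
As a rough illustration of what "affine quantization" with a group size means, the sketch below quantizes a row of weights in groups and stores one scale and one bias per group. It is a plain-Python sketch of the general technique, not MLX's implementation: MLX packs the integer codes and may choose scales and biases differently. The function names and the sample row are hypothetical.

```python
def quantize_group(values, bits=8):
    """Affine-quantize one group of floats to unsigned integer codes.

    Returns (codes, scale, bias) such that value ~= code * scale + bias.
    """
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    # A constant group has no range; any non-zero scale works because all codes are 0.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def quantize(weights, group_size=64, bits=8):
    """Quantize a flat weight row group by group."""
    if len(weights) % group_size != 0:
        raise ValueError("row length must be a multiple of group_size")
    return [quantize_group(weights[i:i + group_size], bits)
            for i in range(0, len(weights), group_size)]


def dequantize(groups):
    """Rebuild approximate floats from (codes, scale, bias) groups."""
    out = []
    for codes, scale, bias in groups:
        out.extend(c * scale + bias for c in codes)
    return out


# Hypothetical 128-element row: two groups at group size 64, one at 128.
row = [((i * 37) % 101 - 50) / 25.0 for i in range(128)]
groups = quantize(row, group_size=64, bits=8)
restored = dequantize(groups)
max_error = max(abs(a - b) for a, b in zip(row, restored))
print(len(groups), max_error)
```

In this scheme the reconstruction error of each weight is at most half a quantization step of its group. A smaller group size stores more scale and bias pairs, so each group's range is fitted more tightly.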

This artifact is intended as the higher-quality sibling to Qwen3.6-27B-MTPLX-Optimized-Speed. It pairs the Flat8 target with the calibrated INT8 proposal sidecar, in contrast to the smaller, speed-focused artifact.

MTPLX

mtplx start --model Youssofal/Qwen3.6-27B-MTPLX-Optimized-Quality

The artifact includes mtplx_runtime.json and mtp.safetensors, so MTPLX can inspect and route it through the native Qwen MTP backend.
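
To look at the sidecar yourself, the safetensors container can be read with the standard library alone: the file starts with an 8-byte little-endian header length, followed by a JSON header that maps tensor names to dtype, shape, and data offsets. The sketch below lists what a file contains without loading tensor data. The helper names are hypothetical, and nothing here assumes particular tensor names inside mtp.safetensors or a schema for mtplx_runtime.json.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file without loading tensors."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize(path):
    """Map each tensor name to (dtype, shape), skipping the metadata entry."""
    header = read_safetensors_header(path)
    return {name: (info["dtype"], info["shape"])
            for name, info in header.items() if name != "__metadata__"}
```

After downloading the artifact, calling summarize("mtp.safetensors") from the model directory would show which tensors are stored in BF16 and which are packed quantized weights.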

Local bakeoff snapshot

Measured on the local M5 Max max-fan Flappy 2k depth-3 bakeoff:

Metric                  Value
Decode TPS              33.63
Acceptance D1/D2/D3     95.6% / 85.3% / 74.1%
Verify time per call    88.1 ms
Peak memory             27.62 GiB

This row is a local release-readiness check, not a broad public hardware claim.
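
One way to relate the acceptance rates to throughput is to estimate how many tokens each verify call yields. The sketch below does this under two assumptions that the numbers above do not settle: that D1/D2/D3 are either cumulative acceptance rates per draft depth or rates conditional on the earlier drafts being accepted, and that each verify call always emits one token from the target model in addition to any accepted drafts. The function names are hypothetical.

```python
def tokens_per_call_cumulative(rates):
    """Expected tokens per verify call if rates[k] is the overall
    probability that the draft at depth k+1 is accepted."""
    return 1.0 + sum(rates)


def tokens_per_call_conditional(rates):
    """Expected tokens per verify call if rates[k] is conditional on
    every shallower draft having been accepted."""
    total, survive = 1.0, 1.0
    for r in rates:
        survive *= r
        total += survive
    return total


rates = [0.956, 0.853, 0.741]   # D1/D2/D3 from the snapshot
verify_s = 88.1 / 1000.0        # verify time per call, in seconds

for label, fn in [("cumulative", tokens_per_call_cumulative),
                  ("conditional", tokens_per_call_conditional)]:
    tokens = fn(rates)
    # Upper bound on decode TPS if verification were the only cost per cycle.
    ceiling = tokens / verify_s
    print(f"{label}: {tokens:.2f} tokens/call, ceiling {ceiling:.1f} TPS")
```

Under either reading the estimate is roughly 3.4 to 3.55 tokens per call, and the resulting ceiling sits above the measured 33.63 decode TPS. That is what one would expect if drafting and other per-cycle overhead add time beyond the verify call itself.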

Provenance

  • Base model: Qwen/Qwen3.6-27B
  • INT8 MTP source: cyankiwi/Qwen3.6-27B-AWQ-BF16-INT8
  • MTPLX staging manifest: mtplx_upload_manifest.json