Qwen3.6 27B MTPLX Optimized Quality

This is the MTPLX Optimized Quality artifact for Qwen3.6 27B.

Policy:

Target trunk: true flat 8-bit MLX affine quantization, group64.
MTP sidecar: calibrated CyanKiwi INT8 MTP linear layers, converted from the compressed-tensors format of cyankiwi/Qwen3.6-27B-AWQ-BF16-INT8 to MLX affine quantization, group128.
MTP auxiliary tensors (mtp.fc, norms, scales/biases, and non-linear tensors) are preserved in BF16 where applicable.
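
To make the "affine, group64 / group128" wording in the policy concrete, here is a minimal pure-Python sketch of group-wise affine quantization. It is an illustration under stated assumptions, not the MLX or MTPLX implementation: the function names are hypothetical, the scale and bias are taken from each group's min and max, and real kernels also pack the integer codes into wider words and may choose scale and bias slightly differently. The bits_per_weight helper assumes one 16-bit scale and one 16-bit bias per group.

```python
from typing import List, Tuple

Group = Tuple[List[int], float, float]  # (codes, scale, bias)


def quantize_group(values: List[float], bits: int = 8) -> Group:
    """Affine-quantize one group using its min as bias and (max - min) as range."""
    levels = (1 << bits) - 1            # 255 for 8-bit
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def dequantize_group(codes: List[int], scale: float, bias: float) -> List[float]:
    """Reconstruct approximate weights: w ~= code * scale + bias."""
    return [c * scale + bias for c in codes]


def quantize(weights: List[float], group_size: int = 64, bits: int = 8) -> List[Group]:
    """Split a flat weight list into fixed-size groups and quantize each one."""
    if len(weights) % group_size != 0:
        raise ValueError("weight count must be a multiple of group_size")
    return [
        quantize_group(weights[i:i + group_size], bits)
        for i in range(0, len(weights), group_size)
    ]


def bits_per_weight(bits: int, group_size: int, meta_bits: int = 16) -> float:
    """Effective storage cost, assuming one scale and one bias per group
    (each meta_bits wide; 16 is an assumption, not taken from the artifact)."""
    return bits + 2 * meta_bits / group_size
```

Under these assumptions, bits_per_weight(8, 64) is 8.5 and bits_per_weight(8, 128) is 8.25, which shows why a larger group size is slightly cheaper to store and a smaller one tracks the weights more closely.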

This is intended as the higher-quality sibling to Qwen3.6-27B-MTPLX-Optimized-Speed. It trades size for quality, pairing the Flat8 target with the calibrated INT8 proposal sidecar rather than matching the smaller footprint of the speed-focused artifact.

MTPLX

mtplx start --model Youssofal/Qwen3.6-27B-MTPLX-Optimized-Quality

The artifact includes mtplx_runtime.json and mtp.safetensors, so MTPLX can inspect and route it through the native Qwen MTP backend.
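
Before pointing MTPLX at a local copy, it can be useful to confirm that both files are actually present. The following is a small sketch using only the Python standard library; the helper name missing_mtplx_files and the idea of checking a local directory are assumptions for illustration, not part of MTPLX.

```python
from pathlib import Path
from typing import List

# File names taken from the model card; the check itself is hypothetical.
REQUIRED_FILES = ("mtplx_runtime.json", "mtp.safetensors")


def missing_mtplx_files(model_dir: str) -> List[str]:
    """Return the required MTPLX files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list means both files were found; anything else names the files a local download is still missing.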

Local bakeoff snapshot

Measured locally on an M5 Max (max fan) in the Flappy 2k depth-3 bakeoff:

Metric	Value
Decode TPS	33.63
Acceptance D1/D2/D3	95.6% / 85.3% / 74.1%
Verify time per call	88.1 ms
Peak memory	27.62 GiB

This row is a local release-readiness check, not a broad public hardware claim.
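
As a rough sanity check on how the acceptance and verify figures relate to decode throughput, the sketch below estimates tokens emitted per verify call. It is a back-of-envelope model, not how the numbers were measured. The card does not say whether D1/D2/D3 are per-depth rates over all calls or rates conditional on the previous depth being accepted, so both readings are shown. It also assumes each verify call emits one target token in addition to any accepted drafts. All function names are hypothetical.

```python
from typing import Sequence

ACCEPT = (0.956, 0.853, 0.741)  # D1, D2, D3 from the snapshot table
VERIFY_MS = 88.1                # verify time per call from the snapshot table


def expected_tokens_cumulative(accept: Sequence[float]) -> float:
    """Assumes each rate is the fraction of calls where that depth was accepted."""
    return 1.0 + sum(accept)


def expected_tokens_conditional(accept: Sequence[float]) -> float:
    """Assumes each rate is conditional on all shallower depths being accepted."""
    total, reach = 1.0, 1.0
    for p in accept:
        reach *= p          # probability of reaching and accepting this depth
        total += reach
    return total


def verify_bound_tps(tokens_per_call: float, verify_ms: float) -> float:
    """Upper bound on tokens/s if verify calls were the only cost."""
    return tokens_per_call / (verify_ms / 1000.0)


if __name__ == "__main__":
    for name, fn in (("cumulative", expected_tokens_cumulative),
                     ("conditional", expected_tokens_conditional)):
        tokens = fn(ACCEPT)
        print(f"{name}: {tokens:.2f} tokens/call, "
              f"<= {verify_bound_tps(tokens, VERIFY_MS):.1f} tok/s")
```

Under either reading the estimate is roughly 3.4 to 3.55 tokens per verify call, and the verify-only bound sits above the measured 33.63 decode TPS, which would be consistent with some time being spent outside the verify call (for example, drafting).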

Provenance

Base model: Qwen/Qwen3.6-27B
INT8 MTP source: cyankiwi/Qwen3.6-27B-AWQ-BF16-INT8
MTPLX staging manifest: mtplx_upload_manifest.json
