These are the MTP (multi-token prediction) tensors for Qwen3.6-27B, quantized at 4 bpw (bits per weight).
Qwen3.5/3.6 checkpoints quantized with exllamav3 v0.0.41+ will automatically contain MTP layers. For quantized checkpoints from earlier versions, place this file in the model directory to enable MTP drafting.
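For checkpoints quantized with an earlier version, the placement step could be scripted roughly as follows. This is only a minimal sketch: the helper `install_mtp_file` and both paths in the usage comment are hypothetical, so substitute the actual file name from this repo and your own model directory.

```python
import shutil
from pathlib import Path


def install_mtp_file(mtp_file, model_dir):
    """Copy the MTP tensor file into an existing quantized model directory.

    Hypothetical helper for illustration; it is not part of exllamav3.
    """
    mtp_file = Path(mtp_file)
    model_dir = Path(model_dir)

    if not mtp_file.is_file():
        raise FileNotFoundError(f"MTP file not found: {mtp_file}")
    if not model_dir.is_dir():
        raise NotADirectoryError(f"Model directory not found: {model_dir}")

    # Keep the original file name so the loader can find it next to the weights.
    target = model_dir / mtp_file.name
    shutil.copy2(mtp_file, target)
    return target


# Example call (both paths are placeholders, not real file names):
# install_mtp_file("downloads/mtp_file.safetensors", "models/Qwen3.6-27B-exl3")
```

Copying rather than moving leaves the downloaded file in place, so it can be reused for another quantization of the same model.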
Quick benchmark (RTX 6000 Pro, 4.00 bpw target model):
| Category | Baseline (t/s) | MTP, greedy (t/s) | Speedup |
|---|---|---|---|
| Agentic, code | 63.88 | 180.51 | 2.83x |
| Agentic, curl | 61.42 | 154.40 | 2.51x |
| Coding | 68.17 | 187.97 | 2.76x |
| Creative | 68.29 | 138.22 | 2.02x |
| Creative (reasoning) | 68.01 | 135.00 | 1.98x |
| Translation | 66.88 | 139.41 | 2.08x |
| Translation (reasoning) | 66.79 | 162.90 | 2.44x |
| Trivial repetition | 68.87 | 237.59 | 3.45x |
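
The speedup figures are the ratio of MTP throughput to baseline throughput. A minimal sketch of that arithmetic, using three rows from the table (the function name `speedup` is illustrative; results recomputed from the rounded throughputs can differ from the table in the last digit):

```python
def speedup(baseline_tps, mtp_tps):
    """Return the throughput ratio of MTP drafting over the baseline."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return mtp_tps / baseline_tps


# (baseline t/s, MTP greedy t/s) taken from the benchmark table
results = {
    "Agentic, code": (63.88, 180.51),
    "Creative": (68.29, 138.22),
    "Trivial repetition": (68.87, 237.59),
}

for category, (baseline, mtp) in results.items():
    print(f"{category}: {speedup(baseline, mtp):.2f}x")
```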