These are the MTP tensors for Qwen3.6-27B, quantized at 4 bpw.

Qwen3.5/3.6 checkpoints quantized with exllamav3 v0.0.41+ will automatically contain MTP layers. For quantized checkpoints from earlier versions, place this file in the model directory to enable MTP drafting.
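For an older checkpoint, the manual step is a file copy into the model directory. The helper below is a minimal sketch of that step. The function name `install_mtp_file` and both path arguments are hypothetical and not part of exllamav3; substitute the actual file from this repository and your own model directory.

```python
import shutil
from pathlib import Path


def install_mtp_file(mtp_file, model_dir):
    """Copy the MTP tensor file into the quantized model's directory.

    Both arguments are placeholders supplied by the caller; this helper
    is illustrative and does not call into exllamav3.
    """
    mtp_file = Path(mtp_file)
    model_dir = Path(model_dir)

    if not mtp_file.is_file():
        raise FileNotFoundError(f"MTP file not found: {mtp_file}")
    if not model_dir.is_dir():
        raise NotADirectoryError(f"Model directory not found: {model_dir}")

    # Keep the original file name so the loader can find it by name.
    destination = model_dir / mtp_file.name
    shutil.copy2(mtp_file, destination)
    return destination
```

Keeping the original file name is an assumption here: the instruction above only says to place the file in the model directory, so renaming it may prevent it from being picked up.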

Quick benchmark (RTX 6000 Pro, 4.00 bpw target model):

| Category | Baseline | MTP (greedy) | Speedup |
|---|---|---|---|
| Agentic, code | 63.88 t/s | 180.51 t/s | 2.83x |
| Agentic, curl | 61.42 t/s | 154.40 t/s | 2.51x |
| Coding | 68.17 t/s | 187.97 t/s | 2.76x |
| Creative | 68.29 t/s | 138.22 t/s | 2.02x |
| Creative (reasoning) | 68.01 t/s | 135.00 t/s | 1.98x |
| Translation | 66.88 t/s | 139.41 t/s | 2.08x |
| Translation (reasoning) | 66.79 t/s | 162.90 t/s | 2.44x |
| Trivial repetition | 68.87 t/s | 237.59 t/s | 3.45x |