Qwable 27B Chadrock ROCmFPX UltraQuality

Qwable 27B Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Qwable 27B Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW is the new quality-first ROCmFPX GGUF for the Unsloth Qwen3.6 27B MTP line. It replaces the older STRIX QUALITY naming and recipe as the default quality build.

The headline is simple: this is the high-quality Strix Halo ROCmFPX build that keeps the speed path alive without accepting the quality drift seen in earlier small mixed-precision experiments. On the fresh card refresh, it landed at 82 on HermesAgent-20, 154/164 on HumanEval+, 25.92 served MTP decode tok/s on a 20KB prompt, and only 0.002420 mean KLD against the BF16 reference.

This is a model/runtime pairing, not a stock upstream GGUF. The files use ROCmFPX tensor types and should be run with a ROCmFPX-aware llama.cpp runner.

File

File Role BPW Size Quality position
Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW.gguf default 7.6146 26,004,616,416 bytes best current ROCmFPX quality recipe

Fresh Comparison

Refresh date: 2026-06-29. Hardware: AMD Ryzen AI Max+ 395 / Strix Halo. Served rows used ROCm, one MTP slot, q8_0/q8_0 target KV, f16/f16 draft KV, draft cap 6, b2048/u512, temperature=0, 512 generated tokens, and a deterministic 20KB prompt measuring 3,946 prompt tokens.

Served MTP Speed

Model BPW Prompt tok/s Decode tok/s Total time Draft accepted Note
UltraQuality 7.61 BPW 7.6146 209.84 25.92 38.56 s 437/439 = 99.5% new default quality build
Superseded STRIX QUALITY 7.37 177.02 8.37 83.49 s 217/1762 = 12.3% historical row, not recommended

UltraQuality is over 3.0x the served decode speed of the superseded old STRIX QUALITY row in this refresh, while also improving the distribution-quality metrics below.

File Quality

PPL was measured with llama-perplexity, WikiText raw, n_ctx=2048, 32 chunks. KLD was measured with llama-perplexity --kl-divergence, BF16 reference, n_ctx=512, 16 chunks.

Model PPL Mean KLD KLD p99 KLD p99.9 Same-top
UltraQuality 7.61 BPW 6.5212 +/- 0.09323 0.002420 +/- 0.000481 0.019161 0.150872 97.843% +/- 0.227
Superseded STRIX QUALITY 6.5097 +/- 0.09282 0.007113 +/- 0.001182 0.057581 0.308613 96.495% +/- 0.288

The PPL row is intentionally not the final quality judge here. UltraQuality is the model that preserves the BF16 distribution closely enough to be the quality default.

Agent And Coding Validation

HermesAgent-20 and EvalPlus are the behavioral checks that catch failures PPL can miss. UltraQuality was rerun for this card refresh. Historical comparison rows are retained only to show what the new default replaces.

Model HermesAgent-20 HumanEval base HumanEval+ Harness failures
UltraQuality 7.61 BPW 82 160/164 = 97.56% 154/164 = 93.90% 0/164
Superseded STRIX QUALITY 78 161/164 = 98.17% 155/164 = 94.51% 0/164
Unsloth Q6 comparison not rerun in refresh 160/164 = 97.56% 153/164 = 93.29% 0/164

The important result is the combined shape: UltraQuality keeps Q6-class coding behavior, beats the old STRIX QUALITY row on HermesAgent-20, and reduces KLD drift versus the historical quality recipe.

Recipe Notes

UltraQuality is the user-facing name for the ranked leave-32 ROCmFPX recipe from the current tuning pass. The local build artifact was the attention-rank-leave32/Q6K-splice candidate, promoted here under the clean public name:

Qwable 27B Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

The older STRIX QUALITY recipe used broad Q6/Q8 promotion and was good enough to show the quality direction, but it had bad served-MTP draft behavior in the refresh. UltraQuality protects the tensors that mattered more surgically, which is why its KLD tail and MTP acceptance recovered at the same time.

Run With ROCmFPX

Build or use a ROCmFPX-aware llama.cpp runner, then launch the default UltraQuality file with the served MTP profile below.

HSA_OVERRIDE_GFX_VERSION=11.5.1 \
GGML_HIP_ENABLE_UNIFIED_MEMORY=1 \
./llama-server \
  -m /models/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW.gguf \
  --alias qwable-27b-chadrock-rocmfpx-ultraquality-7p61bpw \
  --host 127.0.0.1 \
  --port 8080 \
  --jinja \
  -c 65536 \
  -ngl 999 \
  -fa on \
  -dev ROCm0 \
  -sm none \
  -b 2048 \
  -ub 512 \
  -t 16 \
  -tb 32 \
  -ctk q8_0 \
  -ctv q8_0 \
  --ctx-checkpoints 0 \
  --checkpoint-every-n-tokens -1 \
  --spec-type draft-mtp \
  --spec-draft-device ROCm0 \
  --spec-draft-ngl all \
  --spec-draft-type-k f16 \
  --spec-draft-type-v f16 \
  --spec-draft-n-max 6 \
  --spec-draft-n-min 0 \
  --spec-draft-p-min 0.0 \
  --spec-draft-p-split 0.20 \
  --parallel 1 \
  --metrics \
  --no-mmproj \
  --no-context-shift \
  --reasoning off \
  --reasoning-format none \
  --reasoning-budget 0 \
  --temp 0 \
  --top-p 0.95 \
  --top-k 20 \
  --repeat-penalty 1.0 \
  --seed 123

A matching profile is included at:

profiles/qwable-27b-chadrock-rocmfpx-ultraquality-7p61bpw-rocm-mtp.env

Checksums

File SHA256
Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW.gguf 14cb3fb0670163a1b0f73c5df521ce0513cfddd7609d75d0640d00a07537073e

Evidence

Local refresh artifacts used for this card:

speed: card-refresh-20260629 served MTP refresh
quality: WikiText PPL/KLD file refresh, HermesAgent-20, EvalPlus HumanEval+

The public names intentionally hide the internal recipe filenames. The internal UltraQuality source artifact was the ranked leave-32/Q6K-splice GGUF from the ROCmFPX tuning run.

Limitations

  • This is specifically tuned and measured for AMD Strix Halo / Ryzen AI Max+ 395 with ROCm.
  • Stock upstream llama.cpp is not enough; use a ROCmFPX-aware runner.
  • The headline speed row is a 20KB served-MTP prompt refresh, not a full long-context sweep.
  • UMA memory reporting on this platform does not map cleanly to a simple discrete-GPU VRAM number, so this card uses file size and BPW as the public size metrics.

Credits

  • Qwen: Qwen3.6 base model family.
  • Unsloth: Qwen3.6 27B MTP GGUF source lineage.
  • Charlie / ROCmFPX: ROCmFPX tensor formats and llama.cpp runtime work.
  • Ciru Inference Lab: ROCmFPX recipe tuning, Strix Halo benchmarking, and model-card validation.
Downloads last month
441
GGUF
Model size
27B params
Architecture
qwen35
Hardware compatibility
Log In to add your hardware

We're not able to determine the quantization variants.

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Base model

Qwen/Qwen3.6-27B
Quantized
(10)
this model