Chadrockv2 Qwen3.6 27B ROCmFP6 STRIX QUALITY

Chadrockv2 Qwen3.6 27B ROCmFP6 STRIX QUALITY

Chadrockv2 Qwen3.6 27B ROCmFP6 STRIX QUALITY is an AMD-tuned GGUF release of the Unsloth Qwen3.6 27B MTP line. It uses a new ROCmFP6 Strix Quality recipe designed to recover Q6-class agent behavior while keeping the ROCmFPX served-speed advantages on AMD Ryzen AI Max+ 395 / Strix Halo systems.

This is a model/runtime pairing, not a generic GGUF quant. The file uses custom ROCmFPX tensor types and will not run correctly with stock upstream llama.cpp. Use the ROCmFPX branch and launch profile documented below.

Full research report:

https://llm.ciru.ai/reports/rocmfp6-quality-research-report-20260624/

Why This Build Exists

The earlier Strix speed ROCmFP6 recipe was too small for agent quality. It measured about 4.82 BPW and scored clearly below the downloaded Unsloth Q6 baseline on HermesAgent-20. This STRIX QUALITY recipe moves closer to a real Q6-class file by keeping the bulk of tensors in Q6_0_ROCMFPX and promoting high-impact tensors to Q8_0_ROCMFPX.

The result is larger than the old speed recipe but materially better on agent quality:

Model HermesAgent-20 score Base pass Plus pass HumanEval+ plus PPL
Chadrockv2 ROCmFP6 STRIX QUALITY 0.78 14/20 11/20 155/164 = 94.51% 6.5543 +/- 0.0941
Unsloth Q6 baseline 0.76 13/20 11/20 153/164 = 93.29% 6.5296 +/- 0.0934
Old ROCmFP6 Strix Speed 0.60 10/20 9/20 152/164 = 92.68% 6.4077 +/- 0.0902

The important lesson from the tuning run is that perplexity alone was not enough. The old small FP6 recipe looked acceptable by PPL, but failed agent scenarios. HermesAgent-20 and EvalPlus showed that the quality recipe recovered the behavior we needed.

Lineage

Qwen/Qwen3.6-27B
  -> unsloth/Qwen3.6-27B
  -> unsloth/Qwen3.6-27B-MTP-GGUF
  -> Chadrockv2 Qwen3.6 27B ROCmFP6 STRIX QUALITY

The public release name and artifact names are Chadrock names. The source lineage remains explicit in metadata, benchmark notes, and credits.

Files

File Size SHA256
Chadrockv2-Qwen3.6-27B-ROCmFP6-STRIX-QUALITY.gguf 25,196,024,736 bytes 144062b43fade17c15217acf0b4974041f6135d73945bc13e7c13b1d18946b84
Chadrockv2-Qwen3.6-27B-ROCmFP6-STRIX-QUALITY.gguf.sha256 checksum same hash as above
profiles/unsloth-qwen36-27b-mtp-rocmfp6-strix-quality-cap6-q8kv-rocm-hermes64k.env launch profile AMD Strix Halo ROCm profile

Recipe

Recipe Estimated size BPW Tensor mix
STRIX QUALITY 24018.32 MiB 7.37 312 Q6 tensors, 194 Q8 tensors
Straight Q6 ROCmFPX local dry-run 6.59 486 Q6 tensors, 20 Q8 tensors
Old Strix Speed local dry-run 4.82 388 FP4-fast tensors, 118 Q6 tensors
Q6 ROCmFPX Agent local dry-run 7.40 340 Q6 tensors, 166 Q8 tensors

STRIX QUALITY keeps the default tensor type at Q6_0_ROCMFPX, then promotes:

  • token embedding and output tensors
  • attention Q, K, V, O, and fused QKV tensors
  • selected FFN down/gate tensor bands
  • llama.cpp tensors marked by the use_more_bits heuristic

The recipe is implemented as:

LLAMA_FTYPE_MOSTLY_Q6_0_ROCMFPX_STRIX_QUALITY = 118
scripts/quantize-rocmfpx-agent.sh --profile strix-quality

Quality Results

HermesAgent-20 is the deciding quality test for this release because it exposes scenario-level failures that aggregate PPL missed.

Model Score Base pass Plus pass Generation time
Chadrockv2 ROCmFP6 STRIX QUALITY 0.78 14/20 11/20 1541.503 s
Unsloth Q6 baseline 0.76 13/20 11/20 1037.491 s
Old ROCmFP6 Strix Speed 0.60 10/20 9/20 791.457 s

EvalPlus confirms that the quality recipe did not trade away coding correctness:

Model HumanEval base HumanEval+
Chadrockv2 ROCmFP6 STRIX QUALITY 161/164 155/164 = 94.51%
Unsloth Q6 baseline 160/164 153/164 = 93.29%
Old ROCmFP6 Strix Speed 159/164 152/164 = 92.68%

Speed Results

All rows were measured locally on AMD Ryzen AI Max+ 395 / Strix Halo, one-slot served MTP, q8_0 target KV, f16 draft KV, b2048/u512, temperature=0, 512 generated tokens, and no prompt cache reuse.

ROCmFP6 STRIX QUALITY vs Unsloth Q6 Baseline

Prompt tokens FP6 ROCm PP tok/s FP6 ROCm TG tok/s FP6 total Q6 ROCm PP tok/s Q6 ROCm TG tok/s Q6 total
512 177.98 29.52 20.1 s 200.84 22.10 25.6 s
2048 188.44 20.64 34.7 s 208.53 17.38 38.4 s
4096 213.53 30.73 33.5 s 227.13 27.75 34.3 s
16384 223.76 30.03 85.9 s 218.75 25.76 90.3 s
65536 171.08 15.72 388.4 s 166.15 10.81 413.7 s

ROCm vs Vulkan for This FP6 File

Prompt tokens ROCm TG tok/s ROCm total Vulkan TG tok/s Vulkan total
512 29.52 20.1 s 19.58 28.9 s
2048 20.64 34.7 s 19.45 36.3 s
4096 30.73 33.5 s 13.10 57.6 s
16384 30.03 85.9 s 13.41 120.6 s
65536 15.72 388.4 s 9.19 471.6 s

ROCm0 is the recommended backend for this release. Vulkan remains useful as a portability path, but it was slower across this Strix Quality speed matrix.

Run With ROCmFPX

Build the ROCmFPX runner branch containing this ftype and recipe:

git clone https://github.com/ciru-ai/ROCmFPX.git
cd ROCmFPX
git checkout rocmfp6-strix-quality
cmake -S . -B build-strix-rocmfp6-quality-hip \
  -DGGML_HIP=ON \
  -DGGML_VULKAN=ON \
  -DCMAKE_BUILD_TYPE=Release
cmake --build build-strix-rocmfp6-quality-hip -j

Launch the validated AMD Strix Halo profile:

HSA_OVERRIDE_GFX_VERSION=11.5.1 \
GGML_HIP_ENABLE_UNIFIED_MEMORY=1 \
./build-strix-rocmfp6-quality-hip/bin/llama-server \
  -m /path/to/Chadrockv2-Qwen3.6-27B-ROCmFP6-STRIX-QUALITY.gguf \
  --alias chadrockv2-qwen36-27b-rocmfp6-strix-quality \
  --host 127.0.0.1 \
  --port 8080 \
  --jinja \
  -c 65536 \
  -ngl 999 \
  -fa on \
  -dev ROCm0 \
  -sm none \
  -b 2048 \
  -ub 512 \
  -t 16 \
  -tb 32 \
  -ctk q8_0 \
  -ctv q8_0 \
  --ctx-checkpoints 0 \
  --checkpoint-every-n-tokens -1 \
  --spec-type draft-mtp \
  --spec-draft-device ROCm0 \
  --spec-draft-ngl all \
  --spec-draft-type-k f16 \
  --spec-draft-type-v f16 \
  --spec-draft-n-max 6 \
  --spec-draft-n-min 0 \
  --spec-draft-p-min 0.0 \
  --spec-draft-p-split 0.20 \
  --parallel 1 \
  --metrics \
  --no-mmproj \
  --no-context-shift \
  --reasoning off \
  --reasoning-format none \
  --reasoning-budget 0 \
  --temp 0 \
  --top-p 0.95 \
  --top-k 20 \
  --repeat-penalty 1.0 \
  --seed 123

The profile in this repository is the exact env profile used for the HermesAgent-20 lane:

profiles/unsloth-qwen36-27b-mtp-rocmfp6-strix-quality-cap6-q8kv-rocm-hermes64k.env

Provenance

Item Value
quant format Q6_0_ROCMFPX_STRIX_QUALITY
ROCmFPX branch rocmfp6-strix-quality
ROCmFPX commit 7026d4ea51acb6e314526506eccdccdc31987855
public report https://llm.ciru.ai/reports/rocmfp6-quality-research-report-20260624/
local source filename Qwen3.6-27B-MTP-BF16-to-Q6_0_ROCMFPX_STRIX_QUALITY.gguf
public filename Chadrockv2-Qwen3.6-27B-ROCmFP6-STRIX-QUALITY.gguf

The local source filename is intentionally not used as the public artifact name. The uploaded GGUF uses the clean Chadrockv2 release filename shown above.

Limitations

  • This is specifically AMD tuned, with Strix Halo as the measured target.
  • The GGUF requires a ROCmFPX-aware llama.cpp runner.
  • The recipe prioritizes agent quality and served decode speed, not smallest file size.
  • Benchmark numbers are local Strix Halo measurements and depend on driver version, clocks, prompt shape, KV cache settings, and draft-token acceptance.

Credits

  • Qwen: Qwen3.6 27B base model family.
  • Unsloth: Qwen3.6 27B MTP GGUF source lineage and Q6 baseline used for same-source comparison.
  • Charlie / ROCmFPX: ROCmFPX tensor formats and llama.cpp runtime work.
  • Ciru Inference Lab: AMD Strix Halo recipe tuning, quality evaluation, speed testing, and report publishing.
Downloads last month
420
GGUF
Model size
27B params
Architecture
qwen35
Hardware compatibility
Log In to add your hardware

We're not able to determine the quantization variants.

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for jcbtc/Chadrockv2-Qwen3.6-27B-ROCmFP6-STRIX-QUALITY

Base model

Qwen/Qwen3.6-27B
Quantized
(7)
this model