Chadrockv2 Qwen3.6 27B ROCmFP6 STRIX QUALITY

Chadrockv2 Qwen3.6 27B ROCmFP6 STRIX QUALITY is an AMD-tuned GGUF release of the Unsloth Qwen3.6 27B MTP line. It uses a new ROCmFP6 Strix Quality recipe designed to recover Q6-class agent behavior while keeping the ROCmFPX served-speed advantages on AMD Ryzen AI Max+ 395 / Strix Halo systems.

This is a model/runtime pairing, not a generic GGUF quant. The file uses custom ROCmFPX tensor types and will not run correctly with stock upstream llama.cpp. Use the ROCmFPX branch and launch profile documented below.

Full research report:

https://llm.ciru.ai/reports/rocmfp6-quality-research-report-20260624/

Why This Build Exists

The earlier Strix Speed ROCmFP6 recipe compressed too aggressively to hold agent quality. At about 4.82 BPW, it scored 0.60 on HermesAgent-20, clearly below the 0.76 of the downloaded Unsloth Q6 baseline. This STRIX QUALITY recipe moves closer to a real Q6-class file by keeping the bulk of tensors in Q6_0_ROCMFPX and promoting high-impact tensors to Q8_0_ROCMFPX.

The result is larger than the old speed recipe but materially better on agent quality:

| Model | HermesAgent-20 score | Base pass | Plus pass | HumanEval+ | PPL |
| --- | --- | --- | --- | --- | --- |
| Chadrockv2 ROCmFP6 STRIX QUALITY | `0.78` | `14/20` | `11/20` | `155/164 = 94.51%` | `6.5543 +/- 0.0941` |
| Unsloth Q6 baseline | `0.76` | `13/20` | `11/20` | `153/164 = 93.29%` | `6.5296 +/- 0.0934` |
| Old ROCmFP6 Strix Speed | `0.60` | `10/20` | `9/20` | `152/164 = 92.68%` | `6.4077 +/- 0.0902` |

The important lesson from the tuning run is that perplexity alone was not enough. The old Strix Speed recipe had the lowest PPL of the three files (6.4077), yet it passed only 10 of 20 HermesAgent-20 base scenarios. HermesAgent-20 and EvalPlus showed that the quality recipe recovered the behavior we needed.

Lineage

Qwen/Qwen3.6-27B
  -> unsloth/Qwen3.6-27B
  -> unsloth/Qwen3.6-27B-MTP-GGUF
  -> Chadrockv2 Qwen3.6 27B ROCmFP6 STRIX QUALITY

The public release name and artifact names are Chadrock names. The source lineage remains explicit in metadata, benchmark notes, and credits.

Files

| File | Size or type | SHA256 or notes |
| --- | --- | --- |
| `Chadrockv2-Qwen3.6-27B-ROCmFP6-STRIX-QUALITY.gguf` | `25,196,024,736 bytes` | `144062b43fade17c15217acf0b4974041f6135d73945bc13e7c13b1d18946b84` |
| `Chadrockv2-Qwen3.6-27B-ROCmFP6-STRIX-QUALITY.gguf.sha256` | checksum | same hash as above |
| `profiles/unsloth-qwen36-27b-mtp-rocmfp6-strix-quality-cap6-q8kv-rocm-hermes64k.env` | launch profile | AMD Strix Halo ROCm profile |
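Since the file is large, it may be worth checking the download against the published SHA256 before building anything. The following is a minimal sketch using only the Python standard library; the helper names and the local path are assumptions for illustration, not part of this repository.

```python
import hashlib
from pathlib import Path

# Expected digest copied from the Files table.
EXPECTED_SHA256 = "144062b43fade17c15217acf0b4974041f6135d73945bc13e7c13b1d18946b84"


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file in 1 MiB chunks so a ~25 GB GGUF never has to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_gguf(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 matches the expected hex digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the file was downloaded.
    gguf = Path("Chadrockv2-Qwen3.6-27B-ROCmFP6-STRIX-QUALITY.gguf")
    if gguf.exists():
        print("checksum OK" if verify_gguf(gguf) else "checksum MISMATCH")
```

Hashing roughly 25 GB takes a while on most disks, so the check is best run once right after the download completes.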

Recipe

| Recipe | Estimated size | BPW | Tensor mix |
| --- | --- | --- | --- |
| STRIX QUALITY | `24018.32 MiB` | `7.37` | `312` Q6 tensors, `194` Q8 tensors |
| Straight Q6 ROCmFPX | local dry-run | `6.59` | `486` Q6 tensors, `20` Q8 tensors |
| Old Strix Speed | local dry-run | `4.82` | `388` FP4-fast tensors, `118` Q6 tensors |
| Q6 ROCmFPX Agent | local dry-run | `7.40` | `340` Q6 tensors, `166` Q8 tensors |
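The BPW column relates to file size by simple arithmetic: bits per weight is the total number of bits divided by the number of weights. The sketch below works backwards from the STRIX QUALITY row to an implied weight count, then scales it to the other BPW values. It assumes the estimated size covers tensor data only and that the published BPW is rounded, so the projected sizes are rough illustrations, not measurements. All function names are hypothetical.

```python
MIB = 1024 * 1024


def bits_per_weight(size_bytes, n_weights):
    """Average bits stored per weight for a file of the given size."""
    return size_bytes * 8 / n_weights


def implied_weight_count(size_mib, bpw):
    """Weight count implied by a size in MiB and a BPW figure."""
    return size_mib * MIB * 8 / bpw


def projected_size_mib(n_weights, bpw):
    """Rough size in MiB of the same weights stored at a different BPW."""
    return n_weights * bpw / 8 / MIB


# STRIX QUALITY row from the Recipe table: 24018.32 MiB at 7.37 BPW.
n_weights = implied_weight_count(24018.32, 7.37)
print(f"implied weights: {n_weights / 1e9:.2f} billion")

# BPW values from the other rows; the sizes printed here are projections only.
for name, bpw in [("Straight Q6 ROCmFPX", 6.59), ("Old Strix Speed", 4.82)]:
    print(f"{name}: about {projected_size_mib(n_weights, bpw):.0f} MiB at {bpw} BPW")
```

The implied count lands a little above 27 billion, which is consistent with a 27B model once rounding in the BPW figure is taken into account.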

STRIX QUALITY keeps the default tensor type at Q6_0_ROCMFPX, then promotes:

token embedding and output tensors
attention Q, K, V, O, and fused QKV tensors
selected FFN down/gate tensor bands
llama.cpp tensors marked by the use_more_bits heuristic

The recipe is implemented as:

LLAMA_FTYPE_MOSTLY_Q6_0_ROCMFPX_STRIX_QUALITY = 118
scripts/quantize-rocmfpx-agent.sh --profile strix-quality

Quality Results

HermesAgent-20 is the deciding quality test for this release because it exposes scenario-level failures that aggregate PPL missed.

| Model | Score | Base pass | Plus pass | Generation time |
| --- | --- | --- | --- | --- |
| Chadrockv2 ROCmFP6 STRIX QUALITY | `0.78` | `14/20` | `11/20` | `1541.503 s` |
| Unsloth Q6 baseline | `0.76` | `13/20` | `11/20` | `1037.491 s` |
| Old ROCmFP6 Strix Speed | `0.60` | `10/20` | `9/20` | `791.457 s` |

EvalPlus confirms that the quality recipe did not trade away coding correctness:

| Model | HumanEval base | HumanEval+ |
| --- | --- | --- |
| Chadrockv2 ROCmFP6 STRIX QUALITY | `161/164` | `155/164 = 94.51%` |
| Unsloth Q6 baseline | `160/164` | `153/164 = 93.29%` |
| Old ROCmFP6 Strix Speed | `159/164` | `152/164 = 92.68%` |

Speed Results

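All rows were measured locally on an AMD Ryzen AI Max+ 395 / Strix Halo system with one-slot served MTP, q8_0 target KV cache, f16 draft KV cache, batch 2048 and micro-batch 512 (b2048/u512), temperature=0, 512 generated tokens, and no prompt cache reuse. In the tables below, PP is prompt-processing throughput and TG is token-generation throughput.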

ROCmFP6 STRIX QUALITY vs Unsloth Q6 Baseline

| Prompt tokens | FP6 ROCm PP tok/s | FP6 ROCm TG tok/s | FP6 ROCm total | Q6 ROCm PP tok/s | Q6 ROCm TG tok/s | Q6 ROCm total |
| --- | --- | --- | --- | --- | --- | --- |
| `512` | `177.98` | `29.52` | `20.1 s` | `200.84` | `22.10` | `25.6 s` |
| `2048` | `188.44` | `20.64` | `34.7 s` | `208.53` | `17.38` | `38.4 s` |
| `4096` | `213.53` | `30.73` | `33.5 s` | `227.13` | `27.75` | `34.3 s` |
| `16384` | `223.76` | `30.03` | `85.9 s` | `218.75` | `25.76` | `90.3 s` |
| `65536` | `171.08` | `15.72` | `388.4 s` | `166.15` | `10.81` | `413.7 s` |
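To read the comparison above at a glance, the rows can be reduced to a decode speedup and a time saved per request. The sketch below copies the numbers from the table. The `naive_total_seconds` helper is an assumption for illustration: it estimates total time as prompt time plus decode time for 512 generated tokens, and it is not expected to match the reported totals exactly.

```python
GENERATED_TOKENS = 512

# (prompt_tokens, fp6_pp, fp6_tg, fp6_total_s, q6_pp, q6_tg, q6_total_s)
ROWS = [
    (512, 177.98, 29.52, 20.1, 200.84, 22.10, 25.6),
    (2048, 188.44, 20.64, 34.7, 208.53, 17.38, 38.4),
    (4096, 213.53, 30.73, 33.5, 227.13, 27.75, 34.3),
    (16384, 223.76, 30.03, 85.9, 218.75, 25.76, 90.3),
    (65536, 171.08, 15.72, 388.4, 166.15, 10.81, 413.7),
]


def tg_speedup(fp6_tg, q6_tg):
    """Ratio of FP6 decode throughput to Q6 decode throughput."""
    return fp6_tg / q6_tg


def naive_total_seconds(prompt_tokens, pp_tok_s, tg_tok_s, generated=GENERATED_TOKENS):
    """Back-of-envelope total: prompt processing time plus decode time."""
    return prompt_tokens / pp_tok_s + generated / tg_tok_s


for prompt, fp6_pp, fp6_tg, fp6_total, q6_pp, q6_tg, q6_total in ROWS:
    print(
        f"{prompt:>6} prompt tokens: "
        f"TG speedup {tg_speedup(fp6_tg, q6_tg):.2f}x, "
        f"reported time saved {q6_total - fp6_total:.1f} s, "
        f"naive FP6 total {naive_total_seconds(prompt, fp6_pp, fp6_tg):.1f} s"
    )
```

The table shows the FP6 file decoding faster at every prompt length while prompt processing is slower up to 4096 tokens, which is why the end-to-end gap is narrower than the decode speedup alone would suggest.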

ROCm vs Vulkan for This FP6 File

| Prompt tokens | ROCm TG tok/s | ROCm total | Vulkan TG tok/s | Vulkan total |
| --- | --- | --- | --- | --- |
| `512` | `29.52` | `20.1 s` | `19.58` | `28.9 s` |
| `2048` | `20.64` | `34.7 s` | `19.45` | `36.3 s` |
| `4096` | `30.73` | `33.5 s` | `13.10` | `57.6 s` |
| `16384` | `30.03` | `85.9 s` | `13.41` | `120.6 s` |
| `65536` | `15.72` | `388.4 s` | `9.19` | `471.6 s` |

ROCm (device ROCm0) is the recommended backend for this release. Vulkan remains useful as a portability path, but it was slower at every prompt length in this Strix Quality speed matrix, in both decode throughput and total time.

Run With ROCmFPX

Build the ROCmFPX runner branch containing this ftype and recipe:

git clone https://github.com/ciru-ai/ROCmFPX.git
cd ROCmFPX
git checkout rocmfp6-strix-quality
cmake -S . -B build-strix-rocmfp6-quality-hip \
  -DGGML_HIP=ON \
  -DGGML_VULKAN=ON \
  -DCMAKE_BUILD_TYPE=Release
cmake --build build-strix-rocmfp6-quality-hip -j

Launch the validated AMD Strix Halo profile:

HSA_OVERRIDE_GFX_VERSION=11.5.1 \
GGML_HIP_ENABLE_UNIFIED_MEMORY=1 \
./build-strix-rocmfp6-quality-hip/bin/llama-server \
  -m /path/to/Chadrockv2-Qwen3.6-27B-ROCmFP6-STRIX-QUALITY.gguf \
  --alias chadrockv2-qwen36-27b-rocmfp6-strix-quality \
  --host 127.0.0.1 \
  --port 8080 \
  --jinja \
  -c 65536 \
  -ngl 999 \
  -fa on \
  -dev ROCm0 \
  -sm none \
  -b 2048 \
  -ub 512 \
  -t 16 \
  -tb 32 \
  -ctk q8_0 \
  -ctv q8_0 \
  --ctx-checkpoints 0 \
  --checkpoint-every-n-tokens -1 \
  --spec-type draft-mtp \
  --spec-draft-device ROCm0 \
  --spec-draft-ngl all \
  --spec-draft-type-k f16 \
  --spec-draft-type-v f16 \
  --spec-draft-n-max 6 \
  --spec-draft-n-min 0 \
  --spec-draft-p-min 0.0 \
  --spec-draft-p-split 0.20 \
  --parallel 1 \
  --metrics \
  --no-mmproj \
  --no-context-shift \
  --reasoning off \
  --reasoning-format none \
  --reasoning-budget 0 \
  --temp 0 \
  --top-p 0.95 \
  --top-k 20 \
  --repeat-penalty 1.0 \
  --seed 123
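Once the server from the launch command above is running, it can be queried over llama-server's OpenAI-compatible chat endpoint. The following is a minimal client sketch using only the Python standard library. It assumes the host, port, and `--alias` value from that command; the function names are hypothetical, and `max_tokens=512` simply mirrors the benchmark setting.

```python
import json
import urllib.request

# Values taken from the launch command (--host, --port, --alias).
BASE_URL = "http://127.0.0.1:8080"
MODEL_ALIAS = "chadrockv2-qwen36-27b-rocmfp6-strix-quality"


def build_chat_request(prompt, max_tokens=512, temperature=0.0):
    """Build a POST request for the /v1/chat/completions endpoint."""
    payload = {
        "model": MODEL_ALIAS,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, timeout=600):
    """Send one user message and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example call (requires the server to be running):
# print(chat("Summarize what a KV cache does in one sentence."))
```

The generous timeout is deliberate: with a 64k context, a long prompt can take several minutes to process before the first token arrives.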

The profile in this repository is the exact env profile used for the HermesAgent-20 lane:

profiles/unsloth-qwen36-27b-mtp-rocmfp6-strix-quality-cap6-q8kv-rocm-hermes64k.env

Provenance

| Item | Value |
| --- | --- |
| quant format | `Q6_0_ROCMFPX_STRIX_QUALITY` |
| ROCmFPX branch | `rocmfp6-strix-quality` |
| ROCmFPX commit | `7026d4ea51acb6e314526506eccdccdc31987855` |
| public report | `https://llm.ciru.ai/reports/rocmfp6-quality-research-report-20260624/` |
| local source filename | `Qwen3.6-27B-MTP-BF16-to-Q6_0_ROCMFPX_STRIX_QUALITY.gguf` |
| public filename | `Chadrockv2-Qwen3.6-27B-ROCmFP6-STRIX-QUALITY.gguf` |

The local source filename is intentionally not used as the public artifact name. The uploaded GGUF uses the clean Chadrockv2 release filename shown above.

Limitations

This release is tuned specifically for AMD hardware, with Strix Halo as the only measured target.
The GGUF requires a ROCmFPX-aware llama.cpp runner.
The recipe prioritizes agent quality and served decode speed, not smallest file size.
Benchmark numbers are local Strix Halo measurements and depend on driver version, clocks, prompt shape, KV cache settings, and draft-token acceptance.

Credits

Qwen: Qwen3.6 27B base model family.
Unsloth: Qwen3.6 27B MTP GGUF source lineage and Q6 baseline used for same-source comparison.
Charlie / ROCmFPX: ROCmFPX tensor formats and llama.cpp runtime work.
Ciru Inference Lab: AMD Strix Halo recipe tuning, quality evaluation, speed testing, and report publishing.
