Instructions to use jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW with llama-cpp-python:

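# !pip install llama-cpp-python
# Note: this card states the files use ROCmFPX tensor types, so a stock
# llama-cpp-python build may not be able to load them.

from llama_cpp import Llama

# Download the GGUF from the Hub (cached locally after the first call) and load it.
llm = Llama.from_pretrained(
	repo_id="jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW",
	filename="Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW.gguf",
)

# Send a single chat turn and print the assistant's reply.
response = llm.create_chat_completion(
	messages=[
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)
print(response["choices"][0]["message"]["content"])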

Local Apps

llama.cpp

How to use jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW with llama.cpp:

Install (macOS, Linux)

curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW
# Run inference directly in the terminal:
llama cli -hf jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW
# Run inference directly in the terminal:
llama cli -hf jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW
# Run inference directly in the terminal:
./llama-cli -hf jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW
# Run inference directly in the terminal:
./build/bin/llama-cli -hf jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Use Docker

docker model run hf.co/jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW


vLLM

How to use jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Ollama
How to use jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW with Ollama:
```
ollama run hf.co/jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW
```

Unsloth Studio

How to use jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW to start chatting

Pi

How to use jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Run Hermes

hermes

OpenClaw

How to use jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW with OpenClaw:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Configure OpenClaw

# Install OpenClaw:
npm install -g openclaw@latest
# Register the local server and set it as the default model:
openclaw onboard --non-interactive --mode local \
  --auth-choice custom-api-key \
  --custom-base-url http://127.0.0.1:8080/v1 \
  --custom-model-id "jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW" \
  --custom-provider-id llama-cpp \
  --custom-compatibility openai \
  --custom-text-input \
  --accept-risk \
  --skip-health

Run OpenClaw

openclaw agent --local --agent main --message "Hello from Hugging Face"

Docker Model Runner
How to use jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW with Docker Model Runner:
```
docker model run hf.co/jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW
```

Lemonade

How to use jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Run and chat with the model

lemonade run user.Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW-{{QUANT_TAG}}

List all available models

lemonade list

Qwable 27B Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Qwable 27B Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW is the new quality-first ROCmFPX GGUF for the Unsloth Qwen3.6 27B MTP line. It replaces the older STRIX QUALITY naming and recipe as the default quality build.

The headline: this is the high-quality Strix Halo ROCmFPX build that keeps the speed path alive without accepting the quality drift seen in earlier small mixed-precision experiments. In the latest card refresh it scored 82 on HermesAgent-20 and 154/164 on HumanEval+, served 25.92 MTP decode tok/s on a 20KB prompt, and showed a mean KLD of only 0.002420 against the BF16 reference.

This is a model/runtime pairing, not a stock upstream GGUF. The files use ROCmFPX tensor types and should be run with a ROCmFPX-aware llama.cpp runner.

File

| File | Role | BPW | Size | Quality position |
| --- | --- | --- | --- | --- |
| `Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW.gguf` | default | `7.6146` | `26,004,616,416 bytes` | best current ROCmFPX quality recipe |
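
As a rough sanity check on the size and BPW figures, the sketch below divides the file's total bits by the listed BPW to get the implied weight count. It assumes BPW here means "file bits divided by number of weights" (the card does not spell out its definition), and it ignores GGUF metadata overhead; the function name is illustrative.

```python
# Values copied from the File table above.
FILE_SIZE_BYTES = 26_004_616_416
BITS_PER_WEIGHT = 7.6146


def implied_weight_count(size_bytes: int, bpw: float) -> float:
    """Return the weight count implied by a file size and a bits-per-weight figure.

    Assumption: BPW = (file size in bits) / (number of weights).
    """
    return size_bytes * 8 / bpw


implied = implied_weight_count(FILE_SIZE_BYTES, BITS_PER_WEIGHT)
print(f"implied weights: {implied / 1e9:.2f} B")
```

Under that assumption the result comes out a little above 27 billion, which is in line with a 27B-class model.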

Fresh Comparison

Refresh date: 2026-06-29. Hardware: AMD Ryzen AI Max+ 395 / Strix Halo. Served rows used ROCm, one MTP slot, q8_0/q8_0 target KV, f16/f16 draft KV, draft cap 6, b2048/u512, temperature=0, 512 generated tokens, and a deterministic 20KB prompt measuring 3,946 prompt tokens.

Served MTP Speed

| Model | BPW | Prompt tok/s | Decode tok/s | Total time | Draft accepted | Note |
| --- | --- | --- | --- | --- | --- | --- |
| UltraQuality 7.61 BPW | `7.6146` | `209.84` | `25.92` | `38.56 s` | `437/439 = 99.5%` | new default quality build |
| Superseded STRIX QUALITY | `7.37` | `177.02` | `8.37` | `83.49 s` | `217/1762 = 12.3%` | historical row, not recommended |

UltraQuality reaches about 3.1x the served decode speed of the superseded STRIX QUALITY row in this refresh (25.92 vs 8.37 tok/s), while also improving the distribution-quality metrics below.
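
The speed rows can be cross-checked with a little arithmetic. The sketch below recomputes the decode speedup and draft acceptance, and estimates total time as prefill plus decode from the stated 3,946 prompt tokens and 512 generated tokens. Reading "Draft accepted" as accepted/drafted tokens and treating total time as a simple sum of the two phases are assumptions, and the class and function names are illustrative.

```python
from dataclasses import dataclass

PROMPT_TOKENS = 3946      # from the refresh description
GENERATED_TOKENS = 512    # from the refresh description


@dataclass
class ServedRow:
    name: str
    prompt_tps: float
    decode_tps: float
    accepted: int   # assumed: draft tokens accepted
    drafted: int    # assumed: draft tokens proposed


def estimated_total_s(row: ServedRow) -> float:
    """Prefill time plus decode time, ignoring any fixed overhead."""
    return PROMPT_TOKENS / row.prompt_tps + GENERATED_TOKENS / row.decode_tps


def acceptance(row: ServedRow) -> float:
    """Fraction of drafted tokens that were accepted."""
    return row.accepted / row.drafted


ultra = ServedRow("UltraQuality 7.61 BPW", 209.84, 25.92, 437, 439)
old = ServedRow("Superseded STRIX QUALITY", 177.02, 8.37, 217, 1762)

speedup = ultra.decode_tps / old.decode_tps
for row in (ultra, old):
    print(f"{row.name}: ~{estimated_total_s(row):.2f} s, "
          f"acceptance {acceptance(row):.1%}")
print(f"decode speedup: {speedup:.2f}x")
```

The estimates land close to the reported 38.56 s and 83.49 s totals, which suggests the table's throughput and timing columns are mutually consistent.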

File Quality

PPL was measured with llama-perplexity, WikiText raw, n_ctx=2048, 32 chunks. KLD was measured with llama-perplexity --kl-divergence, BF16 reference, n_ctx=512, 16 chunks.

| Model | PPL | Mean KLD | KLD p99 | KLD p99.9 | Same-top |
| --- | --- | --- | --- | --- | --- |
| UltraQuality 7.61 BPW | `6.5212 +/- 0.09323` | `0.002420 +/- 0.000481` | `0.019161` | `0.150872` | `97.843% +/- 0.227` |
| Superseded STRIX QUALITY | `6.5097 +/- 0.09282` | `0.007113 +/- 0.001182` | `0.057581` | `0.308613` | `96.495% +/- 0.288` |

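PPL is intentionally not the final quality judge here: the two PPL values (6.5212 vs 6.5097) differ by far less than their reported error of about 0.093, so PPL cannot separate the builds. The KLD columns can. UltraQuality's mean KLD is roughly a third of the superseded recipe's (0.002420 vs 0.007113), with a tighter tail and higher same-top agreement, which is why it is the quality default.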
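
For readers unfamiliar with the KLD and same-top columns, here is a minimal sketch of the textbook definitions on toy data. It is not the llama-perplexity implementation, whose internals are not shown in this card; the logits are made up, and all function names are illustrative.

```python
import math


def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p_ref, q):
    """KL(p_ref || q): how far q drifts from the reference distribution."""
    return sum(p * math.log(p / qi) for p, qi in zip(p_ref, q) if p > 0)


def same_top(p_ref, q):
    """True when both distributions pick the same most-likely token."""
    return p_ref.index(max(p_ref)) == q.index(max(q))


# Toy per-position logits: "reference" stands in for BF16, "quant" for a quantized file.
reference_logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3], [1.0, 1.1, 3.0], [1.0, 1.05, 0.2]]
quant_logits = [[1.9, 1.1, 0.1], [0.6, 2.4, 0.3], [1.2, 1.0, 2.9], [1.05, 1.0, 0.2]]

pairs = [(softmax(r), softmax(q)) for r, q in zip(reference_logits, quant_logits)]
mean_kld = sum(kl_divergence(p, q) for p, q in pairs) / len(pairs)
same_top_rate = sum(same_top(p, q) for p, q in pairs) / len(pairs)

print(f"mean KLD: {mean_kld:.6f}, same-top: {same_top_rate:.1%}")
```

In this toy set the last position flips its top token even though the distributions are close, which is the kind of behavior the same-top column tracks alongside mean KLD.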

Agent And Coding Validation

HermesAgent-20 and EvalPlus are the behavioral checks that catch failures PPL can miss. UltraQuality was rerun for this card refresh. Historical comparison rows are retained only to show what the new default replaces.

| Model | HermesAgent-20 | HumanEval base | HumanEval+ | Harness failures |
| --- | --- | --- | --- | --- |
| UltraQuality 7.61 BPW | `82` | `160/164 = 97.56%` | `154/164 = 93.90%` | `0/164` |
| Superseded STRIX QUALITY | `78` | `161/164 = 98.17%` | `155/164 = 94.51%` | `0/164` |
| Unsloth Q6 comparison | not rerun in refresh | `160/164 = 97.56%` | `153/164 = 93.29%` | `0/164` |

The important result is the combined shape: UltraQuality keeps Q6-class coding behavior, beats the old STRIX QUALITY row on HermesAgent-20, and reduces KLD drift versus the historical quality recipe.
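
To put the coding rows in proportion, the sketch below recomputes the pass rates from the raw counts and expresses each gap to UltraQuality in whole problems out of 164. The dictionary layout and function name are illustrative only.

```python
TOTAL_PROBLEMS = 164

# (HumanEval base passes, HumanEval+ passes), copied from the table above.
results = {
    "UltraQuality 7.61 BPW": (160, 154),
    "Superseded STRIX QUALITY": (161, 155),
    "Unsloth Q6 comparison": (160, 153),
}


def pass_rate(passed: int, total: int = TOTAL_PROBLEMS) -> float:
    """Pass rate as a percentage rounded to two decimals."""
    return round(100 * passed / total, 2)


ultra_base, ultra_plus = results["UltraQuality 7.61 BPW"]
for name, (base, plus) in results.items():
    print(f"{name}: base {pass_rate(base)}%, plus {pass_rate(plus)}%, "
          f"delta vs UltraQuality: base {base - ultra_base:+d}, plus {plus - ultra_plus:+d}")
```

The differences between rows amount to a single problem in either direction, which is why the card treats the three builds as the same class for coding.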

Recipe Notes

UltraQuality is the user-facing name for the ranked leave-32 ROCmFPX recipe from the current tuning pass. The local build artifact was the attention-rank-leave32/Q6K-splice candidate, promoted here under the clean public name:

Qwable 27B Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

The older STRIX QUALITY recipe used broad Q6/Q8 promotion and was good enough to show the quality direction, but it had bad served-MTP draft behavior in the refresh. UltraQuality is more selective, protecting only the tensors that mattered most, which is why its KLD tail and MTP acceptance recovered at the same time.

Run With ROCmFPX

Build or use a ROCmFPX-aware llama.cpp runner, then launch the default UltraQuality file with the served MTP profile below.

HSA_OVERRIDE_GFX_VERSION=11.5.1 \
GGML_HIP_ENABLE_UNIFIED_MEMORY=1 \
./llama-server \
  -m /models/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW.gguf \
  --alias qwable-27b-chadrock-rocmfpx-ultraquality-7p61bpw \
  --host 127.0.0.1 \
  --port 8080 \
  --jinja \
  -c 65536 \
  -ngl 999 \
  -fa on \
  -dev ROCm0 \
  -sm none \
  -b 2048 \
  -ub 512 \
  -t 16 \
  -tb 32 \
  -ctk q8_0 \
  -ctv q8_0 \
  --ctx-checkpoints 0 \
  --checkpoint-every-n-tokens -1 \
  --spec-type draft-mtp \
  --spec-draft-device ROCm0 \
  --spec-draft-ngl all \
  --spec-draft-type-k f16 \
  --spec-draft-type-v f16 \
  --spec-draft-n-max 6 \
  --spec-draft-n-min 0 \
  --spec-draft-p-min 0.0 \
  --spec-draft-p-split 0.20 \
  --parallel 1 \
  --metrics \
  --no-mmproj \
  --no-context-shift \
  --reasoning off \
  --reasoning-format none \
  --reasoning-budget 0 \
  --temp 0 \
  --top-p 0.95 \
  --top-k 20 \
  --repeat-penalty 1.0 \
  --seed 123

A matching profile is included at:

profiles/qwable-27b-chadrock-rocmfpx-ultraquality-7p61bpw-rocm-mtp.env
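
Once the server from the launch command is running, it can be queried through llama-server's OpenAI-compatible chat endpoint. The following is a minimal standard-library sketch, assuming the `--host`, `--port`, and `--alias` values shown in that command; the helper names `build_chat_request` and `ask` are illustrative and not part of any library.

```python
import json
import urllib.request

# Assumed to match --host/--port and --alias in the launch command.
BASE_URL = "http://127.0.0.1:8080/v1"
MODEL_ALIAS = "qwable-27b-chadrock-rocmfpx-ultraquality-7p61bpw"


def build_chat_request(prompt: str, max_tokens: int = 512) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": MODEL_ALIAS,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 600.0) -> str:
    """Send one chat turn to the local server and return the assistant's text."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server up, `print(ask("What is the capital of France?"))` should print the reply. The generous timeout is a guess meant to leave room for long prompts.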

Checksums

| File | SHA256 |
| --- | --- |
| `Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW.gguf` | `14cb3fb0670163a1b0f73c5df521ce0513cfddd7609d75d0640d00a07537073e` |
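
A downloaded file can be checked against the listed digest with Python's standard `hashlib`. The sketch below streams the file in chunks so a 26 GB GGUF never has to fit in memory; the helper names and the local path in the usage note are illustrative assumptions.

```python
import hashlib
import io

# Digest copied from the Checksums table above.
EXPECTED_SHA256 = "14cb3fb0670163a1b0f73c5df521ce0513cfddd7609d75d0640d00a07537073e"


def sha256_of_stream(stream, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream incrementally and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_file(path: str) -> str:
    """Hash a file on disk without loading it fully into memory."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle)


# Tiny self-check on in-memory data: b"abc" is a standard SHA-256 test vector.
demo_digest = sha256_of_stream(io.BytesIO(b"abc"))
print(demo_digest)
```

To verify the model, compare `sha256_of_file("/models/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW.gguf")` with `EXPECTED_SHA256`, adjusting the path to wherever the file was saved.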

Evidence

Local refresh artifacts used for this card:

speed: card-refresh-20260629 served MTP refresh
quality: WikiText PPL/KLD file refresh, HermesAgent-20, EvalPlus HumanEval+

The public names intentionally hide the internal recipe filenames. The internal UltraQuality source artifact was the ranked leave-32/Q6K-splice GGUF from the ROCmFPX tuning run.

Limitations

This is specifically tuned and measured for AMD Strix Halo / Ryzen AI Max+ 395 with ROCm.
Stock upstream llama.cpp is not enough; use a ROCmFPX-aware runner.
The headline speed row is a 20KB served-MTP prompt refresh, not a full long-context sweep.
UMA memory reporting on this platform does not map cleanly to a simple discrete-GPU VRAM number, so this card uses file size and BPW as the public size metrics.

Credits

Qwen: Qwen3.6 base model family.
Unsloth: Qwen3.6 27B MTP GGUF source lineage.
Charlie / ROCmFPX: ROCmFPX tensor formats and llama.cpp runtime work.
Ciru Inference Lab: ROCmFPX recipe tuning, Strix Halo benchmarking, and model-card validation.

Downloads last month: 441

GGUF

Model size: 27B params

Architecture: qwen35


Model tree for jcbtc/Qwable-27B-Chadrock-ROCmFPX-ULTRAQUALITY-7.61BPW

Base model: Qwen/Qwen3.6-27B
Finetuned: unsloth/Qwen3.6-27B
Quantized (10): this model