Instructions for using lordx64/Qwable-v1 with libraries and local apps.

Libraries

How to use lordx64/Qwable-v1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

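# The messages below include an image and are passed as `text=`, which matches the image-text-to-text pipeline
pipe = pipeline("image-text-to-text", model="lordx64/Qwable-v1")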
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("lordx64/Qwable-v1")
model = AutoModelForMultimodalLM.from_pretrained("lordx64/Qwable-v1")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use lordx64/Qwable-v1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "lordx64/Qwable-v1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lordx64/Qwable-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
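
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on localhost:8000 and returns the usual OpenAI-style `choices[0].message.content` field. The helper names `build_chat_request` and `ask` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, model="lordx64/Qwable-v1",
                       base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the request and return the first choice's message text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example call (needs a running server):
# print(ask("What is the capital of France?"))
```

Passing `base_url="http://localhost:30000"` should point the same helper at the SGLang server described further down, since both expose the same endpoint path.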

Use Docker

docker model run hf.co/lordx64/Qwable-v1

SGLang

How to use lordx64/Qwable-v1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "lordx64/Qwable-v1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lordx64/Qwable-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "lordx64/Qwable-v1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lordx64/Qwable-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use lordx64/Qwable-v1 with Docker Model Runner:
```
docker model run hf.co/lordx64/Qwable-v1
```

Distilled Reasoning or Just XML Cosplay?

by SuperSonnix71 - opened 1 day ago

Discussion

SuperSonnix71

1 day ago

So let me get this right.
The model is trained on 4,659 rows from one developer’s Claude Code sessions, mostly narrow web and game dev work.
The “reasoning traces” are not clearly verified as real raw reasoning traces.
Some sources apparently had redacted thinking blocks, so the CoT may be reconstructed or added after the fact.
There are no formal evals published.
The tool names do not reliably match the original Claude Code tools, so it may need a wrapper just to behave properly.
And the pitch is still basically: “we distilled Claude-level agentic reasoning into Qwen.”
That is not verified reasoning.
That is format imitation vs actual capability.
XML cosplay vs engineering evidence.

lordx64

Owner 1 day ago

You're right on most of these. Almost all of them are stated explicitly in the model card — I'd rather we agree on what this is than have it overclaimed.

Specifically:

Narrow distribution: yes, "Honest scope" + "Dataset provenance" sections both call this out. ~4,659 rows, one developer's CC sessions across web/game/physics work, plus a Boeing 747 trace. Not broad.
Reasoning traces verification: also acknowledged. armand0e/claude-fable-5-claude-code and victor/fable-5-boeing-747-trace (the two Fable-5 sources I tried first) had 100% redacted thinking blocks — Anthropic's preview-model IP protection. Only Glint-Research/Fable-5-traces ships cleartext CoT, and per Glint's own README they added it themselves post-hoc. That's documented in the provenance chain and the "Note on the other Fable-5 sources" subsection.
No formal evals: pending. Every row in the Evaluation table is 🚧 in progress. Standing rule on this project: blank-until-verified, omit-rather-than-mislead. Numbers when they're real.
Tool names don't bind to Claude Code's inventory: yes — model emits str_replace_editor, read_file instead of Edit/Read. Called out in "Tool names are not bound to the Claude Code inventory." Downstream consumers define their own tool registry, but yes, this matters.
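
The tool-name point above can be handled downstream with a small translation layer. This is a rough sketch, not the project's own wrapper: only the two name pairs (str_replace_editor to Edit, read_file to Read) come from this thread, while `dispatch`, the `registry` dict, and the handler argument names (`path`, `old`, `new`) are hypothetical.

```python
# Names emitted by the model (per the reply above) mapped to Claude Code tool names.
# Only these two pairs come from the discussion; extend the table for your own harness.
EMITTED_TO_CLAUDE_CODE = {
    "str_replace_editor": "Edit",
    "read_file": "Read",
}


def dispatch(tool_name, arguments, registry):
    """Resolve an emitted tool name and call the matching handler.

    `registry` maps canonical tool names to callables; it is hypothetical
    and supplied by the downstream consumer.
    """
    canonical = EMITTED_TO_CLAUDE_CODE.get(tool_name, tool_name)
    handler = registry.get(canonical)
    if handler is None:
        raise KeyError(f"no handler registered for tool {tool_name!r}")
    return handler(**arguments)


# Stand-in handlers for illustration only; real ones would touch the filesystem.
registry = {
    "Read": lambda path: f"read {path}",
    "Edit": lambda path, old, new: f"edit {path}: {old} -> {new}",
}
```

Names that are not in the table pass through unchanged, so a call that already uses a canonical name still resolves, and an unknown name fails loudly instead of being silently dropped.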

The pitch is NOT "we distilled Claude-level agentic reasoning into Qwen." The TL;DR explicitly says reasoning ability comes from the Opus 4.7 step in the chain, not from Fable-5. Fable-5 contributes the agentic tool-use axis — system-prompt-conditional, narrower, and the card says so. The TL;DR even spells out the fallback: bare prompts produce markdown code blocks, not XML tool calls.
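
One way to check which of the two output modes a given prompt triggered is a crude classifier over the generated text. The sketch below is a heuristic under stated assumptions: the exact XML tool-call format is not given in this thread, so it treats any paired tag outside a fenced block as "XML-like", which would also catch reasoning tags if a setup emits them. The function name `classify_output` is illustrative.

```python
import re

FENCE = "`" * 3  # built this way to avoid a literal fence inside this example

# Assumption: a tool call appears as a paired tag such as <name>...</name>.
XML_TAG_PAIR = re.compile(r"<([A-Za-z_][\w.-]*)\b[^>]*>.*?</\1>", re.DOTALL)
CODE_BLOCK = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)


def classify_output(text):
    """Return 'xml_like', 'markdown_code_block', or 'plain_text'."""
    # Ignore tags that only appear inside fenced code (e.g. HTML snippets).
    outside_fences = CODE_BLOCK.sub("", text)
    if XML_TAG_PAIR.search(outside_fences):
        return "xml_like"
    if CODE_BLOCK.search(text):
        return "markdown_code_block"
    return "plain_text"
```

Running this over outputs generated with and without the agentic system prompt would give a quick count of how often each mode appears, though it says nothing about whether the tool calls are correct.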

On "format imitation vs actual capability" — fair distinction, and one I'd answer with evals, not arguments. SWE-bench Lite with an OpenHands harness is in flight (the proxy + runner are committed in the source repo at training/swe_bench/). When that number lands, we'll know if this is engineering evidence or XML cosplay. Until then I won't claim more than the card already does.

If any specific line reads as overclaiming, point at it and I'll tighten it.
