Distilled Reasoning or Just XML Cosplay?
So let me get this right.
The model is trained on 4,659 rows from one developer’s Claude Code sessions, mostly narrow web and game dev work.
The “reasoning traces” are not clearly verified as real raw reasoning traces.
Some sources apparently had redacted thinking blocks, so the CoT may be reconstructed or added after the fact.
There are no formal evals published.
The tool names do not reliably match the original Claude Code tools, so it may need a wrapper just to behave properly.
And the pitch is still basically: “we distilled Claude-level agentic reasoning into Qwen.”
That is not verified reasoning.
That is format imitation vs actual capability.
XML cosplay vs engineering evidence.
You're right on most of these. Almost all of them are stated explicitly in the model card — I'd rather we agree on what this is than have it overclaimed.
Specifically:
- Narrow distribution: yes, "Honest scope" + "Dataset provenance" sections both call this out. ~4,659 rows, one developer's CC sessions across web/game/physics work, plus a Boeing 747 trace. Not broad.
- Reasoning traces verification: also acknowledged. armand0e/claude-fable-5-claude-code and victor/fable-5-boeing-747-trace (the two Fable-5 sources I tried first) had 100% redacted thinking blocks, which is Anthropic's preview-model IP protection. Only Glint-Research/Fable-5-traces ships cleartext CoT, and per Glint's own README they added it themselves post-hoc. That's documented in the provenance chain and the "Note on the other Fable-5 sources" subsection.
- No formal evals: pending. Every row in the Evaluation table is 🚧 in progress. Standing rule on this project: blank-until-verified, omit-rather-than-mislead. Numbers when they're real.
- Tool names don't bind to Claude Code's inventory: yes, the model emits str_replace_editor, read_file instead of Edit/Read. Called out in "Tool names are not bound to the Claude Code inventory." Downstream consumers define their own tool registry, but yes, this matters.
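For the tool-name point, here is a minimal sketch of the kind of registry a downstream consumer might define. It assumes the pairing implied by the order above (str_replace_editor to Edit, read_file to Read). The names `TOOL_ALIASES` and `normalize_tool_call`, and the dict shape of a call, are hypothetical illustrations, not something the model card specifies.

```python
# Hypothetical alias table: names the model emits -> names the harness expects.
# Only these two pairs are taken from the thread; extend it for your own registry.
TOOL_ALIASES = {
    "str_replace_editor": "Edit",
    "read_file": "Read",
}


def normalize_tool_call(call: dict) -> dict:
    """Return a copy of `call` with its tool name mapped through TOOL_ALIASES.

    `call` is assumed to look like {"name": str, "arguments": dict}; that
    shape is an illustration, not the model's documented output format.
    Unknown names pass through unchanged so the caller can decide what to do.
    """
    name = call["name"]
    return {**call, "name": TOOL_ALIASES.get(name, name)}


if __name__ == "__main__":
    emitted = {"name": "read_file", "arguments": {"path": "src/main.py"}}
    print(normalize_tool_call(emitted))
```

This only renames the tool. It does not translate argument schemas, which may also differ between the two inventories.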
The pitch is NOT "we distilled Claude-level agentic reasoning into Qwen." The TL;DR explicitly says reasoning ability comes from the Opus 4.7 step in the chain, not from Fable-5. Fable-5 contributes the agentic tool-use axis — system-prompt-conditional, narrower, and the card says so. The TL;DR even spells out the fallback: bare prompts produce markdown code blocks, not XML tool calls.
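The fallback behavior is easy to check mechanically. Below is a rough sketch of a format classifier, assuming a tool call appears as some XML-style element outside any fenced code; the actual tag names are not given in this thread, so the pattern is deliberately generic. `classify_output` and its three labels are hypothetical names.

```python
import re

# Markdown code fence, built this way so the example does not nest fences.
FENCE = "`" * 3

# Assumption: a tool call is an XML-style element such as <name>...</name>.
# The real tag names are not stated in the thread, so any tag name matches.
XML_ELEMENT = re.compile(r"<([A-Za-z_][\w.-]*)\b[^>]*>.*?</\1>", re.DOTALL)


def classify_output(text: str) -> str:
    """Label a completion as 'xml_tool_call', 'markdown_code_block', or 'plain_text'."""
    # Drop fenced code first so XML inside a code sample is not counted as a call.
    outside_fences = re.sub(
        re.escape(FENCE) + r".*?" + re.escape(FENCE), "", text, flags=re.DOTALL
    )
    if XML_ELEMENT.search(outside_fences):
        return "xml_tool_call"
    if FENCE in text:
        return "markdown_code_block"
    return "plain_text"
```

Running this over completions generated with and without the agentic system prompt would show how often the fallback happens. It measures output format only and says nothing about whether the tool calls are correct.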
On "format imitation vs actual capability" — fair distinction, and one I'd answer with evals, not arguments. SWE-bench Lite with an OpenHands harness is in flight (the proxy + runner are committed in the source repo at training/swe_bench/). When that number lands, we'll know if this is engineering evidence or XML cosplay. Until then I won't claim more than the card already does.
If any specific line reads as overclaiming, point at it and I'll tighten it.