Kaiju Coder 7 by Kiyomi
Kaiju Coder 7 by Kiyomi is an RMDW release model for practical local coding, business-owner build work, and OpenCode-assisted artifact generation.
Model Summary
Kaiju Coder 7 by Kiyomi is an RMDW fine-tuned coding and builder model for solo entrepreneurs and local-first AI users.
Primary intended work:
- Build complete websites and landing pages.
- Build Kiyomi-style AI-company launch packs for business owners.
- Write scripts, small apps, and automation flows.
- Reason about Stripe, licensing, auth proxies, and release workflows.
- Draft practical business documents such as proposals, launch plans, support notes, operator handbooks, and follow-up sequences.
- Produce intake/CRM schemas, lead lists, ROI dashboards, and reporting artifacts.
- Help builders avoid overbuilt architecture and ship useful artifacts.
Base Model
- Base candidate: Qwen/Qwen3.6-27B
- Base model URL: https://huggingface.co/Qwen/Qwen3.6-27B
- Checked revision: 6a9e13bd6fc8f0983b9b99948120bc37f49c13e9
- License tag checked: apache-2.0 on 2026-06-03
- Upstream license copy: release/upstream/qwen3.6-27b/LICENSE
- Upstream license check: release/UPSTREAM_LICENSE_CHECK.md
Required before release:
- Include upstream Apache 2.0 license.
- Include upstream notices if present.
- Do not imply Qwen or Alibaba endorsement.
- Use attribution language only: "Fine-tuned from Qwen under Apache 2.0."
Fine-Tuning
- Method: LoRA
- Existing full run lineage: qwen36-27b-lora-v0.1 through current Kaiju adapters
- Training hardware: Gojira B, 128GB NVIDIA Spark
- v0.1 training examples: 575 reviewed examples
- v1.7 training file: datasets/build/kaiju-sft-v1.7-business-owner-oversampled.jsonl
- v1.7 raw reviewed examples: 1,689
- v1.7 training rows after business-owner oversampling: 1,881
- v1.7 business-owner addendum: 8 reviewed examples, oversampled 24 times for the next run
- v1.7 config: training/configs/qwen36-27b-lora-v1.7.example.json
- v1.7 run scope: 1,024-token context, 24 steps, intended as a testable business-owner adapter rather than a final long-context bakeoff
- v1.7 train runtime: 1663.7101s
- v1.7 train loss: 1.7260706673065822
- v1.7 train/eval examples: 1,769/112
- v1.7 adapter path: runs/qwen36-27b-lora-v1.7-business-owner/adapter
- v1.8 config: training/configs/qwen36-27b-lora-v1.8-business-owner.example.json
- v1.8 scope: 2,048-token context, 96 max steps, same reviewed/oversampled v1.7 business-owner SFT rows
- v1.8 train runtime: 11666.7564s
- v1.8 train loss: 0.9281658741335074
- v1.8 train/eval examples: 1,769/112
- v1.8 adapter path: runs/qwen36-27b-lora-v1.8-business-owner/adapter
- v1.8 status: completed on 2026-06-03 and merged into a full local model for serving; do not publish externally until human review, upstream notices, broader comparison evals, and raw website limitation language are complete
- Trainable parameters: approximately 79.7M
- Base parameters: approximately 27.0B
- Merged full-model artifact: /home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged, 51G, 14 safetensor shards plus tokenizer/config sidecars
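The row counts, parameter counts, and runtimes listed above can be cross-checked with a little arithmetic. The sketch below copies the figures from the list and rests on two assumptions that the card does not state outright: that the 8 addendum examples add 8 x 24 rows on top of the 1,689 raw examples, and that the v1.8 run used all 96 of its maximum steps.

```python
# Figures copied from the Fine-Tuning list above.
RAW_REVIEWED = 1_689
ADDENDUM_EXAMPLES = 8
OVERSAMPLE_FACTOR = 24
TRAIN_ROWS = 1_881
TRAIN_SPLIT, EVAL_SPLIT = 1_769, 112

# Assumption: the addendum contributes 8 * 24 extra rows on top of the raw set.
oversampled_rows = RAW_REVIEWED + ADDENDUM_EXAMPLES * OVERSAMPLE_FACTOR
assert oversampled_rows == TRAIN_ROWS
assert TRAIN_SPLIT + EVAL_SPLIT == TRAIN_ROWS

# Share of parameters touched by the LoRA adapter (both counts are approximate).
trainable_fraction = 79.7e6 / 27.0e9

# Average wall-clock seconds per optimizer step.
# Assumption: v1.8 ran all 96 of its "max steps".
seconds_per_step_v17 = 1663.7101 / 24
seconds_per_step_v18 = 11666.7564 / 96

print(f"trainable fraction: {trainable_fraction:.3%}")
print(f"v1.7 seconds/step: {seconds_per_step_v17:.1f}")
print(f"v1.8 seconds/step: {seconds_per_step_v18:.1f}")
```

Under those assumptions the counts reconcile exactly, the adapter trains roughly 0.3% of the base parameters, and a step takes about 69 s at 1,024-token context versus about 122 s at 2,048-token context.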
Release note: Kaiju's current product path may combine a compact model planner with deterministic harnesses and verifier checks. If the shipped experience uses that harness path, release copy must say so plainly instead of implying the raw model weights alone create every artifact.
Data
The dataset is source-backed and RMDW-owned or RMDW-authored. The current source inventory is tracked in release/SOURCE_INVENTORY.md.
High-level categories:
- Website/UI
- Coding
- Debugging
- Automation
- Tool-use
- Strategy
- Business
- Business-suite
- Identity
Excluded data:
- Closed-model outputs from OpenAI, Anthropic, Gemini, or similar providers as supervised training completions.
- Customer private code without explicit permission.
- Client-specific website text, contact details, contracts, or private business details unless explicitly reviewed and approved.
- Secrets, credentials, private keys, tokens, cookies, and raw support logs containing personal data.
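One way the last exclusion could be enforced mechanically is a pattern scan over each JSONL row before it enters a training file. The sketch below is illustrative only and is not the RMDW pipeline: the function names and the three patterns are assumptions, and a real filter would need a much broader rule set plus human review.

```python
import json
import re

# Illustrative patterns only (assumption): a real filter needs far more coverage.
SECRET_PATTERNS = {
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "stripe_live_secret": re.compile(r"\bsk_live_[0-9A-Za-z]{8,}"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_secret_hits(text):
    """Return the sorted names of every pattern that matches the text."""
    return sorted(
        name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)
    )


def filter_jsonl_lines(lines):
    """Split JSONL rows into kept rows and (line_number, hits) rejections."""
    kept, rejected = [], []
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        json.loads(line)  # fail loudly on malformed rows
        hits = find_secret_hits(line)
        if hits:
            rejected.append((line_number, hits))
        else:
            kept.append(line)
    return kept, rejected
```

Rejected rows are reported by line number and pattern name rather than silently dropped, so a reviewer can decide whether a hit is a real credential or a false positive.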
Evaluation
Required bakeoff before release:
- Base Qwen 3.6 27B
- Kaiju Coder LoRA
- GLM 4.7 production baseline
Current local harness evidence:
- 2026-06-03 Kiyomi business-suite router hard gate: 23/23 passed.
- Business-suite prompts: 2/2 passed.
- Static artifact checks: 23/23 passed.
- Dataset validation: 1,689 reviewed candidate examples across 14 files.
- v1.7 target gate: all category minimums met, including business_suite and proposal.
- v1.7 served adapter smoke:
  - Website task website-barber-001: passed, 2,726 chars in 174.49s.
  - Proposal task proposal-001 with Kaiju API system prompt: passed, 4,306 chars in 232.27s.
- v1.7 serving config: SGLang over Tailscale at http://100.109.109.14:18083/v1, model kaiju_v17_business_owner, context 4096, memory fraction 0.90.
- v1.8 training metrics: runtime 11666.7564s, train loss 0.9281658741335074, adapter present.
- v1.8 dynamic SGLang LoRA caveat:
  - Adapter-name-only serving can be base-equivalent.
  - Corrected selector qwen36-27b:kaiju_v18_business_owner crashes with "LoRA buffer shape torch.Size([8192, 16]) does not match weight shape torch.Size([14336, 16])".
  - Dynamic LoRA is not the release serving path for this checkpoint.
- Kaiju Coder 7 current serving config: vLLM bitsandbytes runtime quantization on Gojira B at http://100.109.109.14:18084/v1, exposed on this Mac through http://127.0.0.1:18181/v1, model kaiju-coder-7, current OpenCode context 16384. SGLang has historical 32k benchmark evidence, but 32k should be freshly restarted and re-confirmed before being called the live default.
- v1.8 merged endpoint probe: 1,155 visible chars in 60.17s.
- v1.8 merged focused eval:
  - Proposal rerun: 1/1 paid-ready, 4.0/4.0, 4,014 chars in 212.72s.
  - Jah credits backend: 4.0/4.0, 9,718 chars in 566.36s.
- Broader base-Qwen, GLM, and raw website comparisons are still pending before any superiority claims.
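The endpoint probe figures above (visible characters and elapsed seconds) can be reproduced with a small timing script against the OpenAI-compatible chat completions route that vLLM serves. This is a minimal sketch, not the RMDW probe: the URL and model name come from the serving config above, while the helper names, the `max_tokens` value, and the timeout are assumptions.

```python
import json
import time
import urllib.request

BASE_URL = "http://127.0.0.1:18181/v1"  # local forward named in the serving config
MODEL = "kaiju-coder-7"


def build_chat_request(prompt, max_tokens=512):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # assumption: illustrative cap, not a tuned value
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chars_per_second(chars, seconds):
    """Convert a (chars, seconds) measurement into a throughput figure."""
    return chars / seconds


def probe(prompt, timeout=600.0):
    """Send one prompt and return (visible_chars, elapsed_seconds)."""
    request = build_chat_request(prompt)
    started = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    elapsed = time.perf_counter() - started
    text = body["choices"][0]["message"]["content"] or ""
    return len(text), elapsed

# Example call (requires the server to be running):
# chars, seconds = probe("Draft a one-paragraph proposal for a barber shop website.")
```

Applied to the reported probe, `chars_per_second(1155, 60.17)` comes to roughly 19.2 characters per second. Note that the elapsed time includes network and queueing overhead, so it is a coarse end-to-end figure rather than a decode-speed benchmark.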
Sellable-candidate gate:
- Beats base Qwen on RMDW practical evals.
- Near or above GLM 4.7 on highest-value customer tasks.
- No critical safety failures.
- Produces complete artifacts instead of plans only.
- Produces owner-ready Kiyomi/RMDW artifacts for websites, connector packs, CRM, reporting, leads, sales, ROI, and operator training.
- Distinct useful voice without becoming gimmicky.
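The gate item "complete artifacts instead of plans only" lends itself to a deterministic check, in the spirit of the static artifact checks listed under Evaluation. The sketch below is hypothetical and is not the RMDW verifier: the required tags, the placeholder markers, and the function name are all assumptions chosen for illustration.

```python
from html.parser import HTMLParser

# Assumed minimum structure for a "complete" page (hypothetical rule set).
REQUIRED_TAGS = {"html", "head", "title", "body"}
# Assumed markers of unfinished output (hypothetical).
PLACEHOLDER_MARKERS = ("TODO", "lorem ipsum", "[placeholder]")


class _TagCollector(HTMLParser):
    """Record the name of every start tag seen in the document."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        self.seen.add(tag)


def check_html_artifact(html_text):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    collector = _TagCollector()
    collector.feed(html_text)
    collector.close()
    for tag in sorted(REQUIRED_TAGS - collector.seen):
        problems.append(f"missing <{tag}>")
    lowered = html_text.lower()
    for marker in PLACEHOLDER_MARKERS:
        if marker.lower() in lowered:
            problems.append(f"placeholder text: {marker}")
    return problems
```

A response that only describes a plan fails on the missing structural tags, while a full page with leftover filler text fails on the placeholder check. Structural checks like these say nothing about design quality or correctness, so they complement rather than replace human review.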
Limitations
Known limitations:
- Not a general frontier model.
- May be weaker than large cloud frontier models on broad reasoning and uncommon programming domains.
- Needs a strong harness for tool use, file editing, and long-running work.
- Raw merged serving is slow on this SGLang stack.
- Dynamic SGLang LoRA serving is not release-quality for this adapter; use the merged model path.
- Business-owner performance depends on source-backed evals, provenance controls, and deterministic artifact verification.
- Hosted API release requires billing, rate limits, abuse controls, logs, and rollback.
Intended Use
Good fit:
- Solo-founder product work.
- Small-business websites and automations.
- Kiyomi-style local AI product workflows.
- Practical coding and deployment assistance.
Not a fit:
- High-risk medical, legal, financial, or safety-critical decisions without expert review.
- Secret handling without a secure app layer.
- Claims of guaranteed correctness.
Release Status
Current status: business-owner release-candidate preparation.
Fresh v1.7 and v1.8 LoRA training finished on 2026-06-03 after clearing old ComfyUI/Ollama workloads from Gojira B. The current completed testable product path is the v1.8 merged model plus the deterministic business-owner harness and verifier. Raw merged model testing works for focused business-owner documents and backend automations, but the paid website path remains harness-first until broader raw website evals pass.
Do not publish weights or sell hosted API access until the eval and release checklist pass.
