Instructions for using RMDWLLC/kaiju-coder-7 with libraries and local apps.

Libraries

How to use RMDWLLC/kaiju-coder-7 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

# The messages below include an image, so use the image-text-to-text task
pipe = pipeline("image-text-to-text", model="RMDWLLC/kaiju-coder-7")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("RMDWLLC/kaiju-coder-7")
model = AutoModelForImageTextToText.from_pretrained("RMDWLLC/kaiju-coder-7")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use RMDWLLC/kaiju-coder-7 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "RMDWLLC/kaiju-coder-7"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "RMDWLLC/kaiju-coder-7",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
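
The same request can be sent from Python using only the standard library. This is a minimal sketch, assuming the vLLM server started above is listening on its default `localhost:8000`; the helper names `build_chat_request` and `chat` are illustrative and not part of any library.

```python
import json
import urllib.request

# Assumption: the server from `vllm serve` above is running on localhost:8000.
BASE_URL = "http://localhost:8000/v1"
MODEL = "RMDWLLC/kaiju-coder-7"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL):
    """Build the same POST request as the curl example, without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, timeout=600.0):
    """Send the request and return the text of the first choice."""
    request = build_chat_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible chat responses carry the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` mirrors the curl call. The generous default timeout is a guess that leaves room for slow local generation.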

Use Docker

docker model run hf.co/RMDWLLC/kaiju-coder-7

SGLang

How to use RMDWLLC/kaiju-coder-7 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "RMDWLLC/kaiju-coder-7" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "RMDWLLC/kaiju-coder-7",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "RMDWLLC/kaiju-coder-7" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "RMDWLLC/kaiju-coder-7",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner

How to use RMDWLLC/kaiju-coder-7 with Docker Model Runner:

```
docker model run hf.co/RMDWLLC/kaiju-coder-7
```

Kaiju Coder 7 by Kiyomi

Kaiju Coder 7 by Kiyomi is an RMDW release model for practical local coding, business-owner build work, and OpenCode-assisted artifact generation.

Model Summary

Kaiju Coder 7 by Kiyomi is an RMDW fine-tuned coding and builder model for solo entrepreneurs and local-first AI users.

Primary intended work:

Build complete websites and landing pages.
Build Kiyomi-style AI-company launch packs for business owners.
Write scripts, small apps, and automation flows.
Reason about Stripe, licensing, auth proxies, and release workflows.
Draft practical business documents such as proposals, launch plans, support notes, operator handbooks, and follow-up sequences.
Produce intake/CRM schemas, lead lists, ROI dashboards, and reporting artifacts.
Help builders avoid overbuilt architecture and ship useful artifacts.

Base Model

Base candidate: Qwen/Qwen3.6-27B
Base model URL: https://huggingface.co/Qwen/Qwen3.6-27B
Checked revision: 6a9e13bd6fc8f0983b9b99948120bc37f49c13e9
License tag checked: apache-2.0 on 2026-06-03
Upstream license copy: release/upstream/qwen3.6-27b/LICENSE
Upstream license check: release/UPSTREAM_LICENSE_CHECK.md

Required before release:

Include upstream Apache 2.0 license.
Include upstream notices if present.
Do not imply Qwen or Alibaba endorsement.
Use attribution language only: "Fine-tuned from Qwen under Apache 2.0."

Fine-Tuning

Method: LoRA
Existing full run lineage: qwen36-27b-lora-v0.1 through current Kaiju adapters
Training hardware: Gojira B, 128GB NVIDIA Spark
v0.1 training examples: 575 reviewed examples
v1.7 training file: datasets/build/kaiju-sft-v1.7-business-owner-oversampled.jsonl
v1.7 raw reviewed examples: 1,689
v1.7 training rows after business-owner oversampling: 1,881
v1.7 business-owner addendum: 8 reviewed examples, oversampled 24 times for the next run
v1.7 config: training/configs/qwen36-27b-lora-v1.7.example.json
v1.7 run scope: 1,024-token context, 24 steps, intended as a testable business-owner adapter rather than a final long-context bakeoff
v1.7 train runtime: 1663.7101s
v1.7 train loss: 1.7260706673065822
v1.7 train/eval examples: 1,769 / 112
v1.7 adapter path: runs/qwen36-27b-lora-v1.7-business-owner/adapter
v1.8 config: training/configs/qwen36-27b-lora-v1.8-business-owner.example.json
v1.8 scope: 2,048-token context, 96 max steps, same reviewed/oversampled v1.7 business-owner SFT rows
v1.8 train runtime: 11666.7564s
v1.8 train loss: 0.9281658741335074
v1.8 train/eval examples: 1,769 / 112
v1.8 adapter path: runs/qwen36-27b-lora-v1.8-business-owner/adapter
v1.8 status: completed on 2026-06-03 and merged into a full local model for serving; do not publish externally until human review, upstream notices, broader comparison evals, and raw website limitation language are complete
Trainable parameters: approximately 79.7M
Base parameters: approximately 27.0B
Merged full-model artifact: /home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged, 51G, 14 safetensor shards plus tokenizer/config sidecars
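
The figures in this list can be cross-checked with a few lines of arithmetic. The sketch below assumes the oversampled addendum copies are appended on top of the raw reviewed set, that every configured step actually ran, and that the merged weights are stored at 2 bytes per parameter (BF16); none of these assumptions is stated outright in the list.

```python
# Figures copied from the Fine-Tuning list; parameter counts are approximate.
RAW_REVIEWED = 1_689        # v1.7 raw reviewed examples
ADDENDUM_EXAMPLES = 8       # business-owner addendum
OVERSAMPLE_TIMES = 24       # copies of each addendum example
TRAIN_ROWS, EVAL_ROWS = 1_769, 112
TRAINABLE_PARAMS = 79.7e6
BASE_PARAMS = 27.0e9

# Assumption: oversampled copies are added to the raw set, not substituted in.
rows_after_oversampling = RAW_REVIEWED + ADDENDUM_EXAMPLES * OVERSAMPLE_TIMES
trainable_fraction = TRAINABLE_PARAMS / BASE_PARAMS

# Assumption: all 24 (v1.7) and all 96 (v1.8) steps ran to completion.
v17_seconds_per_step = 1663.7101 / 24
v18_seconds_per_step = 11666.7564 / 96

# Assumption: 2 bytes per parameter for the merged BF16 weights.
merged_size_gib = BASE_PARAMS * 2 / 2**30

print(rows_after_oversampling)           # 1881
print(TRAIN_ROWS + EVAL_ROWS)            # 1881
print(f"{trainable_fraction:.3%}")       # 0.295%
print(f"{v17_seconds_per_step:.1f}")     # 69.3
print(f"{v18_seconds_per_step:.1f}")     # 121.5
print(f"{merged_size_gib:.1f}")          # 50.3
```

Under these assumptions the row counts agree (1,689 + 8 × 24 = 1,769 + 112 = 1,881), the adapter trains roughly 0.3% of the base parameters, and the estimated weight size is in line with the 51G reported for the merged artifact.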

Release note: Kaiju's current product path may combine a compact model planner with deterministic harnesses and verifier checks. If the shipped experience uses that harness path, release copy must say so plainly instead of implying the raw model weights alone create every artifact.
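
To make "deterministic verifier checks" concrete, here is a minimal sketch of a static check on a generated HTML page. It is not the project's actual verifier: the class `_PageFacts`, the function `verify_page`, and the specific rules (doctype, non-empty title, an `<h1>`, no "lorem ipsum" filler) are all assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class _PageFacts(HTMLParser):
    """Collect a few structural facts about an HTML document."""

    def __init__(self):
        super().__init__()
        self.has_doctype = False
        self.tags_seen = set()
        self.title = ""
        self._in_title = False

    def handle_decl(self, decl):
        # Called for declarations such as <!DOCTYPE html>.
        if decl.lower().startswith("doctype"):
            self.has_doctype = True

    def handle_starttag(self, tag, attrs):
        self.tags_seen.add(tag)
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def verify_page(html_text):
    """Return a list of failure messages; an empty list means every check passed."""
    facts = _PageFacts()
    facts.feed(html_text)
    facts.close()

    failures = []
    if not facts.has_doctype:
        failures.append("missing doctype")
    if not facts.title.strip():
        failures.append("empty or missing <title>")
    if "h1" not in facts.tags_seen:
        failures.append("missing <h1>")
    if "lorem ipsum" in html_text.lower():
        failures.append("placeholder text: lorem ipsum")
    return failures
```

Because the checks involve no model call, the same page always yields the same result, which is the property that separates a harness-plus-verifier path from raw model output.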

Data

The dataset is source-backed and RMDW-owned or RMDW-authored. The current source inventory is tracked in release/SOURCE_INVENTORY.md.

High-level categories:

Website/UI
Coding
Debugging
Automation
Tool-use
Strategy
Business
Business-suite
Identity

Excluded data:

Closed-model outputs from OpenAI, Anthropic, Gemini, or similar providers as supervised training completions.
Customer private code without explicit permission.
Client-specific website text, contact details, contracts, or private business details unless explicitly reviewed and approved.
Secrets, credentials, private keys, tokens, cookies, and raw support logs containing personal data.
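
The secrets exclusion above lends itself to an automated first pass. The following is a rough sketch only: the patterns, the `prompt`/`completion` record fields, and the function names are hypothetical, and a screen like this would supplement human review rather than replace it.

```python
import re

# Illustrative patterns; a real screen would need a much broader rule set.
SECRET_PATTERNS = {
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "bearer_token": re.compile(r"(?i)\bbearer\s+[a-z0-9._\-]{20,}"),
    "assigned_secret": re.compile(
        r"(?i)\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*['\"]?[^\s'\"]{8,}"
    ),
    "cookie_header": re.compile(r"(?i)^\s*(set-)?cookie:", re.MULTILINE),
}


def find_secret_hits(text):
    """Return the sorted names of every pattern that matches the text."""
    return sorted(
        name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)
    )


def keep_example(example):
    """Keep a training record only if no secret-like pattern matches.

    Assumes a hypothetical record shape with 'prompt' and 'completion' keys.
    """
    combined = "\n".join(str(example.get(key, "")) for key in ("prompt", "completion"))
    return not find_secret_hits(combined)
```

Records that fail `keep_example` would be set aside for manual inspection rather than silently dropped, so false positives can be restored.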

Evaluation

Required bakeoff before release:

Base Qwen 3.6 27B
Kaiju Coder LoRA
GLM 4.7 production baseline

Current local harness evidence:

2026-06-03 Kiyomi business-suite router hard gate: 23/23 passed.
Business-suite prompts: 2/2 passed.
Static artifact checks: 23/23 passed.
Dataset validation: 1,689 reviewed candidate examples across 14 files.
v1.7 target gate: all category minimums met, including business_suite and proposal.
v1.7 served adapter smoke:
- Website task website-barber-001: passed, 2,726 chars in 174.49s.
- Proposal task proposal-001 with Kaiju API system prompt: passed, 4,306 chars in 232.27s.
v1.7 serving config: SGLang over Tailscale at http://100.109.109.14:18083/v1, model kaiju_v17_business_owner, context 4096, memory fraction 0.90.
v1.8 training metrics: runtime 11666.7564s, train loss 0.9281658741335074, adapter present.
v1.8 dynamic SGLang LoRA caveat:
- Adapter-name-only serving can be base-equivalent.
- Corrected selector qwen36-27b:kaiju_v18_business_owner crashes with LoRA buffer shape torch.Size([8192, 16]) does not match weight shape torch.Size([14336, 16]).
- Dynamic LoRA is not the release serving path for this checkpoint.
Kaiju Coder 7 current serving config: vLLM with bitsandbytes runtime quantization on Gojira B at http://100.109.109.14:18084/v1, exposed on this Mac through http://127.0.0.1:18181/v1, model kaiju-coder-7, current OpenCode context 16384. SGLang has historical 32k benchmark evidence, but the server should be freshly restarted and the 32k context re-confirmed before 32k is called the live default.
v1.8 merged endpoint probe: 1,155 visible chars in 60.17s.
v1.8 merged focused eval:
- Proposal rerun: 1/1 paid-ready, 4.0/4.0, 4,014 chars in 212.72s.
- Jah credits backend: 4.0/4.0, 9,718 chars in 566.36s.
Broader base-Qwen, GLM, and raw website comparisons are still pending before any superiority claims.
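
The character counts and timings above can be turned into a rough generation rate. This sketch uses visible characters per second as a crude stand-in for throughput; it ignores prompt processing time and token counts, which the list does not report.

```python
# (label, visible characters, wall-clock seconds) copied from the evidence list.
RUNS = [
    ("v1.7 website-barber-001", 2_726, 174.49),
    ("v1.7 proposal-001", 4_306, 232.27),
    ("v1.8 merged endpoint probe", 1_155, 60.17),
    ("v1.8 proposal rerun", 4_014, 212.72),
    ("v1.8 Jah credits backend", 9_718, 566.36),
]


def chars_per_second(chars, seconds):
    """Visible output characters divided by total wall-clock time."""
    return chars / seconds


rates = {label: chars_per_second(chars, secs) for label, chars, secs in RUNS}
for label, rate in rates.items():
    print(f"{label}: {rate:.1f} chars/s")
# v1.7 website-barber-001: 15.6 chars/s
# v1.7 proposal-001: 18.5 chars/s
# v1.8 merged endpoint probe: 19.2 chars/s
# v1.8 proposal rerun: 18.9 chars/s
# v1.8 Jah credits backend: 17.2 chars/s
```

All five runs land between roughly 15 and 19 characters per second, so a 10,000-character artifact takes on the order of nine to eleven minutes at these rates.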

Sellable-candidate gate:

Beats base Qwen on RMDW practical evals.
Near or above GLM 4.7 on highest-value customer tasks.
No critical safety failures.
Produces complete artifacts instead of plans only.
Produces owner-ready Kiyomi/RMDW artifacts for websites, connector packs, CRM, reporting, leads, sales, ROI, and operator training.
Distinct useful voice without becoming gimmicky.
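
One way to keep this gate honest is to encode it as an explicit all-or-nothing check. The sketch below is hypothetical throughout: the `BakeoffSummary` fields, the scoring scale, and the `glm_margin` and `min_complete_rate` thresholds are assumptions, since the card does not define how "near" GLM 4.7 or "complete" is measured.

```python
from dataclasses import dataclass


@dataclass
class BakeoffSummary:
    """Hypothetical per-release summary; field names and scales are illustrative."""
    kaiju_score: float              # mean score on RMDW practical evals
    base_qwen_score: float          # same evals, base Qwen
    glm_score: float                # same evals, GLM 4.7 baseline
    critical_safety_failures: int
    complete_artifact_rate: float   # share of tasks yielding a full artifact
    owner_ready: bool               # reviewer judgment
    voice_ok: bool                  # reviewer judgment


def sellable_candidate(summary, glm_margin=0.05, min_complete_rate=0.95):
    """Return (passed, per-criterion results). Thresholds are assumptions."""
    checks = {
        "beats_base_qwen": summary.kaiju_score > summary.base_qwen_score,
        "near_or_above_glm": summary.kaiju_score >= summary.glm_score - glm_margin,
        "no_critical_safety_failures": summary.critical_safety_failures == 0,
        "complete_artifacts": summary.complete_artifact_rate >= min_complete_rate,
        "owner_ready_artifacts": summary.owner_ready,
        "distinct_useful_voice": summary.voice_ok,
    }
    return all(checks.values()), checks
```

Returning the per-criterion dictionary alongside the overall verdict makes it clear which condition blocked a release; a single critical safety failure, for instance, fails the gate regardless of scores.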

Limitations

Known limitations:

Not a general frontier model.
May be weaker than large cloud frontier models on broad reasoning and uncommon programming domains.
Needs a strong harness for tool use, file editing, and long-running work.
Raw merged serving is slow on this SGLang stack.
Dynamic SGLang LoRA serving is not release-quality for this adapter; use the merged model path.
Business-owner performance depends on source-backed evals, provenance controls, and deterministic artifact verification.
Hosted API release requires billing, rate limits, abuse controls, logs, and rollback.

Intended Use

Good fit:

Solo-founder product work.
Small-business websites and automations.
Kiyomi-style local AI product workflows.
Practical coding and deployment assistance.

Not a fit:

High-risk medical, legal, financial, or safety-critical decisions without expert review.
Secret handling without a secure app layer.
Claims of guaranteed correctness.

Release Status

Current status: business-owner release-candidate preparation.

Fresh v1.7 and v1.8 LoRA training finished on 2026-06-03 after clearing old ComfyUI/Ollama workloads from Gojira B. The current completed testable product path is the v1.8 merged model plus the deterministic business-owner harness and verifier. Raw merged model testing works for focused business-owner documents and backend automations, but the paid website path remains harness-first until broader raw website evals pass.

Do not publish weights or sell hosted API access until the eval and release checklist pass.

Safetensors

Model size: 27B params
Tensor type: BF16