Super Kaiju by RMDW
Super Kaiju is RMDW's frontier-size private coding model, served from RMDW's own hardware as an OpenAI-compatible API. It is the big sibling to Kaiju-Coder: where Kaiju-Coder is the fast, local, fine-tuned builder model, Super Kaiju is a frontier-size open coder run across a two-machine Apple Silicon cluster for the work that wants the biggest model you can self-host.
What it is, honestly
Super Kaiju is Qwen3-Coder-480B-A35B-Instruct (a 480B-parameter, 35B-active mixture-of-experts coder, Apache-2.0) served at 4-bit through MLX, pipeline-parallel across two Apple M3 Ultra Mac Studios wired together over Thunderbolt with RDMA. It is wrapped in RMDW's Kaiju harness and system identity and exposed as a drop-in OpenAI-compatible endpoint.
It is not a fine-tune and it is not claimed to beat the frontier labs. The honest pitch:
- Frontier-size, fully private. A 480B coder running on hardware RMDW owns, with no tokens leaving the building.
- Flat-rate, not metered. One monthly price; send the requests your work needs.
- Capability over speed. Because it is a frontier-size model on a home cluster, it trades throughput for size: expect roughly 13 tokens/second, slower than the entry Kaiju-Coder. You run Super Kaiju when you want the larger model, not when you want the fastest one.
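To get a feel for what roughly 13 tokens/second means in practice, here is a back-of-the-envelope sketch. It assumes a steady decode rate and ignores prompt processing and network latency, so treat the numbers as rough lower bounds rather than measured results. The function name is hypothetical and used only for illustration.

```python
# Rough wall-clock estimate for generating a response at a steady decode rate.
# The 13 tokens/second default is the approximate figure quoted above.
# Assumption: prompt-processing time and network latency are ignored.

def estimate_generation_seconds(output_tokens: int, tokens_per_second: float = 13.0) -> float:
    """Return the approximate seconds needed to generate `output_tokens` tokens."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    # Print estimates for a short answer, a medium file, and a long file.
    for n in (130, 1_000, 4_000):
        secs = estimate_generation_seconds(n)
        print(f"{n:>5} tokens ~ {secs:6.1f} s ({secs / 60:.1f} min)")
```

At this rate a multi-thousand-token file takes minutes, which is why streaming the response is usually the more comfortable way to work with the model.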
Model details
- Base model: Qwen/Qwen3-Coder-480B-A35B-Instruct (Apache-2.0)
- Served quant: mlx-community/Qwen3-Coder-480B-A35B-Instruct-4bit (~270 GB)
- Serving: MLX pipeline-parallel across 2× Apple M3 Ultra (256 GB each), JACCL/RDMA over Thunderbolt, custom main-thread inference server
- Context: 256K tokens (extendable)
- Interface: OpenAI-compatible POST /v1/chat/completions, streaming and non-streaming
- License: Apache-2.0, fine-tuned-from / served-from Qwen under Apache 2.0. No Qwen or Alibaba endorsement implied.
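The ~270 GB figure for the 4-bit quant can be sanity-checked with simple arithmetic. The sketch below assumes a group-wise quantization layout with one 16-bit scale and one 16-bit bias per group of 64 weights, which adds about 0.5 bits per weight. That layout is an assumption for illustration and is not confirmed for this particular checkpoint. The function name is hypothetical.

```python
# Approximate on-disk / in-memory size of group-quantized weights.
# Assumptions (not confirmed for this checkpoint): 4-bit weights, groups of 64,
# and one 16-bit scale plus one 16-bit bias stored per group.

def quantized_weight_gb(
    params: float,
    bits: int = 4,
    group_size: int = 64,
    scale_bits: int = 16,
    bias_bits: int = 16,
) -> float:
    """Return the approximate weight size in GB (1 GB = 1e9 bytes)."""
    # Per-group overhead is spread evenly across the weights in the group.
    bits_per_weight = bits + (scale_bits + bias_bits) / group_size
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    weights_gb = quantized_weight_gb(480e9)
    total_memory_gb = 2 * 256  # two M3 Ultra machines, 256 GB each
    print(f"weights: ~{weights_gb:.0f} GB")
    print(f"left for KV cache, activations, and the OS: ~{total_memory_gb - weights_gb:.0f} GB")
```

Under these assumptions the weights alone exceed what a single 256 GB machine can hold, which is consistent with the model being split across two machines.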
Intended use
Building complete websites and apps, wiring Stripe and auth, writing scripts and automations, drafting proposals and invoices, and the rest of the real work that ships a one-person business, on a private endpoint with no per-token meter.
How to use
curl https://api.rmdw.ai/v1/chat/completions \
-H "Authorization: Bearer rmdw-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"super-kaiju","messages":[{"role":"user","content":"Write a Python function that returns the nth Fibonacci number."}]}'
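The same request can be made from Python using only the standard library. This is a minimal sketch, not an official client. The endpoint, model name, and key format come from the curl example above. The `stream` field and the `data: ...` / `[DONE]` chunk format are assumed to follow the usual OpenAI convention, and the `RMDW_API_KEY` environment variable and both function names are hypothetical.

```python
import json
import os
import urllib.request
from typing import Iterable, Iterator

API_URL = "https://api.rmdw.ai/v1/chat/completions"  # endpoint from the curl example


def build_chat_request(prompt: str, api_key: str, stream: bool = False) -> urllib.request.Request:
    """Build the same POST as the curl example, optionally asking for a stream."""
    payload = {
        "model": "super-kaiju",
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,  # assumed to follow the OpenAI convention
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # conventional end-of-stream marker
        choices = json.loads(data).get("choices") or []
        if choices:
            text = choices[0].get("delta", {}).get("content")
            if text:
                yield text


if __name__ == "__main__":
    key = os.environ.get("RMDW_API_KEY")  # hypothetical variable name
    if key:
        prompt = "Write a Python function that returns the nth Fibonacci number."
        request = build_chat_request(prompt, key, stream=True)
        # The HTTP response object can be iterated line by line as bytes.
        with urllib.request.urlopen(request) as response:
            for piece in iter_stream_text(response):
                print(piece, end="", flush=True)
        print()
```

Because the endpoint is described as OpenAI-compatible, an existing OpenAI client library pointed at the same base URL may also work, though that is not confirmed here.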
Built and self-hosted by Richard Echols / RMDW LLC, Atlanta.