Super Kaiju by RMDW

Super Kaiju is RMDW's frontier-size private coding model, served from RMDW's own hardware as an OpenAI-compatible API. It is the big sibling to Kaiju-Coder: where Kaiju-Coder is the fast, local, fine-tuned builder model, Super Kaiju is a frontier-size open coder run across a two-machine Apple Silicon cluster for the work that wants the biggest model you can self-host.

What it is, honestly

Super Kaiju is Qwen3-Coder-480B-A35B-Instruct (a 480B-parameter, 35B-active mixture-of-experts coder, Apache-2.0) served at 4-bit through MLX, pipeline-parallel across two Apple M3 Ultra Mac Studios wired together over Thunderbolt with RDMA. It is wrapped in RMDW's Kaiju harness and system identity and exposed as a drop-in OpenAI-compatible endpoint.

It is not a fine-tune and it is not claimed to beat the frontier labs. The honest pitch:

Frontier-size, fully private. A 480B coder running on hardware RMDW owns, with no tokens leaving the building.
Flat-rate, not metered. One monthly price; send the requests your work needs.
Capability over speed. Because it is a frontier-size model on a home cluster, it trades throughput for size: expect roughly 13 tokens/second, which is slower than the entry-level Kaiju-Coder. You run Super Kaiju when you want the larger model, not when you want the fastest one.
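
To make the throughput trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes the ~13 tokens/second figure applies to generated output and ignores prompt processing time, which this card does not quantify. The function name and the sample sizes are illustrative, not part of any RMDW tooling.

```python
def estimated_generation_seconds(output_tokens, tokens_per_second=13.0):
    """Rough wall-clock estimate for generating `output_tokens` tokens.

    Assumption: a steady decode rate (default ~13 tok/s, the figure quoted
    above) and no allowance for prompt processing or network latency.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    # Illustrative response sizes: a short function, a module, a large file.
    for n in (200, 1000, 4000):
        print(f"{n} tokens: ~{estimated_generation_seconds(n):.0f} s")
```

In other words, a long single-file answer is a wait measured in minutes rather than seconds, which is the trade the pitch describes.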

Model details

Base model: Qwen/Qwen3-Coder-480B-A35B-Instruct (Apache-2.0)
Served quant: mlx-community/Qwen3-Coder-480B-A35B-Instruct-4bit (~270 GB)
Serving: MLX pipeline-parallel across 2× Apple M3 Ultra (256 GB each), JACCL/RDMA over Thunderbolt, custom main-thread inference server
Context: 256K tokens (extendable)
Interface: OpenAI-compatible POST /v1/chat/completions, streaming and non-streaming
License: Apache-2.0. Served from the unmodified Qwen base model under Apache-2.0; not a fine-tune. No Qwen or Alibaba endorsement implied.
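
The memory figures above can be sanity-checked with simple arithmetic. This is a sketch under stated assumptions: decimal gigabytes, and an even split of the weights between the two machines (the card does not say how the pipeline stages are divided). The gap between the raw 4-bit figure and the listed ~270 GB is presumably quantization metadata or layers kept at higher precision; that is a guess, not something the card states.

```python
GB = 1e9  # decimal gigabytes (assumption)

total_params = 480e9          # 480B parameters
bits_per_weight = 4           # served at 4-bit
machines = 2                  # two M3 Ultra Mac Studios
memory_per_machine_gb = 256   # unified memory per machine
listed_quant_gb = 270         # approximate size listed for the served quant

# Raw size of the weights at 4 bits each, before any quantization overhead.
raw_weights_gb = total_params * bits_per_weight / 8 / GB

# Total memory across the cluster.
cluster_memory_gb = machines * memory_per_machine_gb

# Assumption: the pipeline split is even, so each machine holds half.
per_machine_share_gb = listed_quant_gb / machines
headroom_per_machine_gb = memory_per_machine_gb - per_machine_share_gb

if __name__ == "__main__":
    print(f"raw 4-bit weights:    {raw_weights_gb:.0f} GB")
    print(f"cluster memory:       {cluster_memory_gb} GB")
    print(f"weights per machine:  {per_machine_share_gb:.0f} GB")
    print(f"headroom per machine: {headroom_per_machine_gb:.0f} GB")
```

The point of the exercise: the ~270 GB quant does not fit in a single 256 GB machine, which is why the model is split across two, and whatever is left over on each machine has to cover the KV cache for long contexts plus the operating system.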

Intended use

Building complete websites and apps, wiring Stripe and auth, writing scripts and automations, drafting proposals and invoices, and the rest of the real work that ships a one-person business, on a private endpoint with no per-token meter.

How to use

curl https://api.rmdw.ai/v1/chat/completions \
  -H "Authorization: Bearer rmdw-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"super-kaiju","messages":[{"role":"user","content":"Write a Python function that returns the nth Fibonacci number."}]}'
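
The same request can be sketched in Python using only the standard library. Assumptions to flag: the streaming mode is taken to follow the usual OpenAI convention (a `"stream": true` field, `data: {...}` lines carrying `choices[0].delta.content`, and a final `data: [DONE]`), which this card implies by calling the endpoint OpenAI-compatible but does not spell out. The helper names are hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.rmdw.ai/v1/chat/completions"  # from the curl example


def build_request(prompt, api_key, stream=False):
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": "super-kaiju",
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def delta_from_sse_line(line):
    """Return the text fragment carried by one streamed line, or None.

    Assumes OpenAI-style server-sent events; blank lines, non-data lines,
    and the terminating "[DONE]" marker all yield None.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None
    choices = json.loads(data).get("choices") or []
    if not choices:
        return None
    return choices[0].get("delta", {}).get("content")


def stream_chat(prompt, api_key):
    """Yield text fragments as they arrive (this performs a network call)."""
    request = build_request(prompt, api_key, stream=True)
    with urllib.request.urlopen(request) as response:
        for raw in response:
            text = delta_from_sse_line(raw.decode("utf-8"))
            if text:
                yield text
```

A caller would then write something like `for piece in stream_chat("Write a Fibonacci function.", "rmdw-YOUR_KEY"): print(piece, end="", flush=True)`. At roughly 13 tokens/second, streaming is likely the more comfortable mode for long answers, since text appears as it is generated instead of after the full response completes.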

Built and self-hosted by Richard Echols / RMDW LLC, Atlanta.
