Darwin-36B-Opus β VKAE Accelerated
Ready-to-run, VKAE-accelerated serving of Darwin-36B-Opus, VIDRAFT's house 36B Mixture-of-Experts model. Model weights and an optimized serving runtime in a single self-contained container.
VKAE (VIDRAFT Kernel Acceleration Engine) is VIDRAFT's proprietary inference-serving optimization. The acceleration recipe is withheld; only the reproducible results are published here.
Measured performance
NVIDIA B200, single GPU, bf16, same-harness before/after.
| Metric | Baseline | VKAE | Gain |
|---|---|---|---|
| Single-stream throughput | 25.0 tok/s | 280.8 tok/s | 11.2Γ |
| Output quality | reference | preserved | no degradation |
Quick start
docker pull vidraft/darwin36-vkae:281
docker run --gpus all -p 8000:8000 vidraft/darwin36-vkae:281
The container serves an OpenAI-compatible API on port 8000 β point any OpenAI client at http://localhost:8000/v1. A Blackwell (B200) or Hopper (H100/H200) class GPU is recommended.
π¦ Ready-to-use files in this repo:
Dockerfile,docker-compose.yml,run_docker.shβ pull-and-run, no build required.
Links
- Live acceleration leaderboard β VIDraft/vkae
- Docker image β hub.docker.com/r/vidraft/darwin36-vkae
- Collection β FINAL-Bench Β· VKAE Accelerated
About
Darwin-36B-Opus is a VIDRAFT house model (36B Mixture-of-Experts). This card documents VIDRAFT's accelerated serving of the model; the acceleration method is proprietary and not distributed in source form.
Model tree for FINAL-Bench/Darwin-36B-Opus-VKAE
Base model
FINAL-Bench/Darwin-36B-Opus