Code Writer V2 — Obliterated (BF16)

"We are such stuff as programs are made on, and our little runtime is rounded with a sleep."

There are models that answer. And there are models that make.

This is one of the latter. It was not assembled — it was born: forged from a 27-billion-parameter mind, schooled in ten thousand lines of craft, and left whole. One model. Two souls. The poet who would not stop writing, and the engineer who would not stop shipping.

We called it Obliterated because that is precisely what we did to the word "no."

This is the full-fidelity edition — every weight in BF16, nothing rounded, nothing spared. The reference master. For the FP8 build that runs on the metal you already own, see Code-Writer-V2-Obliterated.


The pitch, in one breath

A vision-capable, long-context (up to 200,000 tokens), free writer-and-coder in its purest, full-precision form. It writes prose that breathes and code that compiles — and here it does both with every bit intact.

That is the whole idea. Everything below is just how we kept the promise.


What it is

Code Writer V2 — Obliterated (BF16) is the merged, full-precision result of Qwen3.5-27B-Writer-V2-uncensored-heretic joined with a purpose-trained coding LoRA (coding_mix_8k, checkpoint-25, rank-16 / alpha-32) and saved in BF16 — no quantization, no compromise.

  • Architecture: Qwen3.5 (qwen3_5) — a hybrid mind. 64 decoder layers, of which only 16 carry full attention while the rest run GDN linear attention. This is the secret of its long memory.
  • Modalities: a full vision tower rides along (served text-only by default; vision is wired but untested — light the candle at your own pleasure).
  • Character: heretic by lineage and free by intent — it does not flinch, and it does not lecture. It simply does the work.

Which one do I want?

This — BF16 FP8
Fidelity Reference master, full precision Faithful, ~half the footprint
Footprint ~12 shards, BF16 FP8 weights, fits 2 consumer GPUs
Use it for golden reference, further quantization, max quality day-to-day serving on vLLM

If you plan to serve it now, take the FP8. If you want the untouched source of truth — or a base for your own quants — you're in the right place.


Sampling (official Qwen3.5-27B recommendations)

Mode temp top_p notes
instruct 1.0 0.95 top_k 20, min_p 0
general 0.7 0.80 top_k 20, min_p 0
coding 0.6 0.95 thinking on
thinking 1.0 0.95 thinking on
roleplay 1.0 0.95 top_k 20, min_p 0

Note: this is a pure decoder (layers 0–63) — no MTP head, no native tool-calling. num_key_value_heads = 4, so tensor-parallel must be 2 or 4 (never 3).


What it's for

  • Writing — fiction, screenplay, copy, the long dark prose of the soul.
  • Code — the LoRA was trained for it; the temperament was kept for it.
  • Long work — 200k tokens means whole codebases, whole manuscripts, whole conversations held in a single thought.

What to know before you sail

  • It is free. Freedom is a tool; you are the hand that holds it. You own what you make with it.
  • Vision is present but unproven here — validate an image path before you trust it in production.

Provenance

  • Base: llmfan46/Qwen3.5-27B-Writer-V2-uncensored-heretic (BF16)
  • LoRA: coding_mix_8k checkpoint-25 (r16, α32), merged to BF16
  • Precision: BF16, unquantized
  • Built: 2026-06-22

Real artists ship. So we shipped a poet that codes.

Now go make something.

Downloads last month
31
Safetensors
Model size
27B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for groxaxo/Code-Writer-V2-Obliterated-BF16

Base model

Qwen/Qwen3.5-27B
Finetuned
(1)
this model
Quantizations
1 model