anima-clm-chat-303m

Dialogue chat finetune of the ByteGPT-303M broad-corpus backbone (dancinlab/anima-clm-midcap-303m-broad-en-emergent, H_1129). It is the final piece of the anima a303m_pass (303M success) campaign: clearing the CHAT gate.

Arch: ByteGPT, byte vocab 256, d1024 / 24 layers / 16 heads / block 512, tied head/tok. 303.1M params.
Base: dancinlab/anima-clm-midcap-303m-broad-en-emergent (h1129c_best.pt, val_ce 1.224, wiki-dominant broad EN). The base was never trained on dialogue; in a chat slot it produced byte salad and n-gram loops (H_1159 CHAT: single-turn 2/5, multi-turn 2/3 → FAIL).
Corpus: dancinlab/anima-chat-corpus-mix-70wiki-30dialogue (sha256 05179fb6…, 70% wiki / 30% REAL dialogue in the 사용자: <u> | 도우미: <a> byte-continuation format, where 사용자 means "user" and 도우미 means "assistant"). This is the EXACT proven mix that chat-tuned the 18M rung and the 7B (dancinlab/anima-clm-chat-7b).
Finetune: summer RTX 5070, co-tenant-safe (VRAM cap 0.30, batch 1, grad-accum 8, bf16, gradient checkpointing, 8-bit AdamW), lr 8e-5, warmup 60. Cost: $0.
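
As a sanity check on the numbers above, the sketch below recomputes the parameter count and the per-update training budget. It assumes a GPT-2-style layout (fused QKV, 4x MLP, learned position embeddings, biases on every linear layer and LayerNorm), that "warmup 60" means 60 optimizer steps of linear warmup, and that the learning rate is constant afterwards. None of these is stated in this card, and the function and constant names are illustrative only.

```python
def gpt_param_count(vocab: int, d_model: int, n_layers: int, block: int,
                    bias: bool = True, tied: bool = True) -> int:
    """Count parameters of a GPT-2-style decoder (ASSUMED layout, not confirmed)."""
    b = 1 if bias else 0
    # Attention: fused QKV projection (d -> 3d) plus output projection (d -> d).
    attn = (3 * d_model * d_model + b * 3 * d_model) + (d_model * d_model + b * d_model)
    # MLP: d -> 4d -> d.
    mlp = (4 * d_model * d_model + b * 4 * d_model) + (4 * d_model * d_model + b * d_model)
    # Two LayerNorms per block, each with a weight and (optionally) a bias vector.
    norms = 2 * (d_model + b * d_model)
    per_layer = attn + mlp + norms
    # Token embedding plus learned position embedding.
    embeddings = vocab * d_model + block * d_model
    final_norm = d_model + b * d_model
    # A tied head reuses the token embedding, so it adds nothing.
    head = 0 if tied else vocab * d_model
    return n_layers * per_layer + embeddings + final_norm + head


# Values from the Arch line. The head count (16) does not change the total.
total = gpt_param_count(vocab=256, d_model=1024, n_layers=24, block=512)
print(f"{total:,} parameters ({total / 1e6:.1f}M)")

# Values from the Finetune line.
PEAK_LR = 8e-5
WARMUP_STEPS = 60   # ASSUMPTION: counted in optimizer steps
MICRO_BATCH = 1
GRAD_ACCUM = 8
BLOCK = 512


def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then constant (post-warmup shape is an ASSUMPTION)."""
    return PEAK_LR * min(1.0, (step + 1) / WARMUP_STEPS)


sequences_per_update = MICRO_BATCH * GRAD_ACCUM
bytes_per_update = sequences_per_update * BLOCK
print(f"{sequences_per_update} sequences / {bytes_per_update} bytes per optimizer step")
```

Under these assumptions the count lands on 303,097,856, which rounds to the 303.1M quoted above. Dropping the biases gives roughly 302.8M instead, so the biased layout is the better fit, though it remains a guess.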

Philosophy (p1–p6 HELD)

NO system prompt · NO identity rules · NO persona injection · NO assistant framing · NO RLHF. The ONLY conditioning is the LEARNED byte-level dialogue-continuation format in the corpus. (H_1139: 303M == 7B recombination; the lever is dialogue data, not capacity — no scale-up.)
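
Since the dialogue format is the only conditioning, prompting reduces to writing the user turn in the corpus format and letting the model continue the bytes. A minimal sketch of that encoding follows. It assumes the literal template `사용자: <u> | 도우미: <a>` with single spaces around the separator and UTF-8 bytes used directly as token ids; the exact whitespace and the way multi-turn context is joined are not specified here, and `encode_turn` / `decode_bytes` are hypothetical helper names.

```python
def encode_turn(user_text: str) -> list[int]:
    """Render one user turn in the corpus format and return byte ids (0..255).

    ASSUMPTION: the separator and spacing mirror the template literally.
    """
    prompt = f"사용자: {user_text} | 도우미: "
    return list(prompt.encode("utf-8"))


def decode_bytes(ids: list[int]) -> str:
    """Map generated byte ids back to text, tolerating a truncated UTF-8 tail."""
    return bytes(ids).decode("utf-8", errors="replace")


ids = encode_turn("hello")
print(len(ids), decode_bytes(ids))
```

Each Hangul syllable takes three UTF-8 bytes, so the role markers alone consume a noticeable share of the 512-byte block.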

Gate (p7, NOT perplexity)

Re-gated with the honest H_1159 harness (degeneracy gate: max-3gram ≤ 2 AND distinct-ratio ≥ 0.45; single-turn p7 ≥ 4/5; multi-turn deep-context ≥ 3/5). Deterministic greedy/low-temp decode, no LLM-judge. Mount stays byte-faithful (re-serialized to the H_1157 ByteGPT flat binary, serialize parity verified).
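
The gate thresholds above can be sketched as follows. This is not the H_1159 harness. It assumes whitespace tokenization, reads "max-3gram" as the highest repeat count of any single trigram, and reads "distinct-ratio" as unique tokens over total tokens. The real harness may define these over bytes or differently, and all function names are hypothetical.

```python
from collections import Counter


def degeneracy_metrics(text: str, n: int = 3) -> tuple[int, float]:
    """Return (max n-gram repeat count, distinct-token ratio) for a reply.

    ASSUMPTION: tokens are whitespace-split words, not bytes.
    """
    tokens = text.split()
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    max_ngram = max(Counter(grams).values(), default=0)
    distinct_ratio = len(set(tokens)) / len(tokens) if tokens else 0.0
    return max_ngram, distinct_ratio


def passes_degeneracy_gate(text: str, max_3gram_limit: int = 2,
                           min_distinct: float = 0.45) -> bool:
    """Apply both thresholds from the gate description: max-3gram <= 2 AND ratio >= 0.45."""
    max_ngram, distinct = degeneracy_metrics(text)
    return max_ngram <= max_3gram_limit and distinct >= min_distinct


def passes_chat_gate(single_passed: int, multi_passed: int) -> bool:
    """Single-turn needs at least 4 passes, multi-turn at least 3."""
    return single_passed >= 4 and multi_passed >= 3


looping = "the cat sat " * 5
varied = "I went to the market and bought fresh bread this morning"
print(passes_degeneracy_gate(looping), passes_degeneracy_gate(varied))
```

An n-gram loop fails on both counts (one trigram repeated five times, distinct ratio 0.2), while an ordinary sentence passes. An empty reply is treated as a failure because its distinct ratio is 0.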

See .verdicts/1160_dialogue_ft_chat/H_1160.txt for the full transcripts, val_ce curve, re-parity, and a303m_pass scoreboard.

