RitsuGPT

A small, from-scratch GPT in pure Rust — it trains on a single consumer GPU (an NVIDIA GeForce RTX 5060, 8 GB) and runs on your own computer. nanoGPT, in Rust.

Trainer & source code: github.com/NeonixLabs/RitsuGPT · Part of Neonix Labs.

What it is, honestly: a ~16.9M-parameter small language model in the spirit of TinyStories (Eldan & Li, 2023). It learns to write simple, coherent short English stories. It is not a production assistant — no world knowledge, no reasoning, no instruction following. Its value is a clean, hackable, from-scratch stack you can train and verify yourself.

Files

| File | What |
| --- | --- |
| `ritsu-step25000.mpk` | Weights at 25,000 steps (recommended), in `burn` CompactRecorder format |
| `ritsu-step12000.mpk` | Weights at 12,000 steps (earlier checkpoint) |
| `tokenizer.json` | Byte-level BPE tokenizer (vocab 8192), HuggingFace `tokenizers` format |

Results

Evaluation reports bits-per-byte (BPB) on the TinyStories validation set — tokenizer-invariant, lower is better.

| Checkpoint | Steps | BPB |
| --- | --- | --- |
| `ritsu-step12000.mpk` | 12,000 | 0.695 |
| `ritsu-step25000.mpk` | 25,000 | 0.6843 |
| byte-level baseline | n/a | 0.805 |
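
For readers who want to reproduce the metric: bits-per-byte is commonly obtained by converting the summed token-level cross-entropy from nats to bits and dividing by the number of UTF-8 bytes in the evaluated text. The sketch below assumes that definition. The function name `bits_per_byte` and its inputs are illustrative and are not taken from the trainer's source.

```rust
/// Convert a summed cross-entropy loss into bits per byte (BPB).
///
/// `total_nll_nats` is the negative log-likelihood summed over every
/// predicted token, measured in nats (natural log), which is what most
/// cross-entropy implementations return.
/// `total_bytes` is the UTF-8 byte length of the same evaluation text.
///
/// Hypothetical helper for illustration; not part of the RitsuGPT trainer.
fn bits_per_byte(total_nll_nats: f64, total_bytes: u64) -> f64 {
    assert!(total_bytes > 0, "need at least one byte of text");
    // nats -> bits is a division by ln(2); then normalise by bytes, not tokens.
    total_nll_nats / (std::f64::consts::LN_2 * total_bytes as f64)
}

/// Same metric, starting from a mean per-token loss and a token count.
fn bits_per_byte_from_mean(mean_loss_nats: f64, tokens: u64, total_bytes: u64) -> f64 {
    bits_per_byte(mean_loss_nats * tokens as f64, total_bytes)
}
```

Normalising by bytes rather than by tokens is what makes the number comparable across tokenizers: a tokenizer with larger tokens has a higher per-token loss but fewer tokens, and the two effects cancel in the byte-level ratio.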

How to run

This is a Rust / burn model — not a transformers model — so there is no hosted inference widget. Run it locally with the trainer:

git clone https://github.com/NeonixLabs/RitsuGPT
cd RitsuGPT
# put ritsu-step25000.mpk and tokenizer.json in this folder (download them from this repo)
cargo run --release --bin neonix-train -- sample ./ritsu-step25000 ./tokenizer.json "Once upon a time" 200 0.8 40

Pass the checkpoint path without the .mpk suffix — the loader appends it. Inference runs on CPU.
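
The trailing arguments of the `sample` command are not documented on this card. If they follow the common nanoGPT-style convention of a token budget, a temperature, and a top-k cutoff (an assumption, not something confirmed here), then each generation step would look roughly like the sketch below. The function names are hypothetical and the code is independent of the trainer's actual implementation.

```rust
/// Build a sampling distribution from raw logits using temperature and top-k.
/// Returns (token_id, probability) pairs for the k highest logits,
/// most likely first. Hypothetical helper, not the trainer's code.
fn top_k_distribution(logits: &[f32], temperature: f32, k: usize) -> Vec<(usize, f32)> {
    assert!(temperature > 0.0, "temperature must be positive");
    // Pair every logit with its token id, then sort by logit, descending.
    let mut indexed: Vec<(usize, f32)> = logits.iter().copied().enumerate().collect();
    indexed.sort_by(|a, b| b.1.total_cmp(&a.1));
    // Keep only the k best candidates (at least one, at most all of them).
    indexed.truncate(k.max(1).min(logits.len()));
    if indexed.is_empty() {
        return indexed;
    }
    // Softmax over temperature-scaled logits; subtract the max for stability.
    let max_scaled = indexed[0].1 / temperature;
    let mut sum = 0.0_f32;
    for (_, v) in indexed.iter_mut() {
        *v = (*v / temperature - max_scaled).exp();
        sum += *v;
    }
    for (_, v) in indexed.iter_mut() {
        *v /= sum;
    }
    indexed
}

/// Pick a token id from the distribution given a uniform draw `u` in [0, 1).
/// The caller supplies `u`, so any random number generator can be used.
fn sample_with(dist: &[(usize, f32)], u: f32) -> Option<usize> {
    let mut cumulative = 0.0_f32;
    for &(id, p) in dist {
        cumulative += p;
        if u < cumulative {
            return Some(id);
        }
    }
    // Guard against rounding leaving the cumulative sum slightly below 1.
    dist.last().map(|&(id, _)| id)
}
```

Under this reading, a temperature below 1 sharpens the distribution toward the most likely token, and a small top-k removes the long tail of unlikely tokens, which tends to keep small models on topic.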

Architecture

A standard decoder-only Transformer, optimized in Rust.

License

MIT. Trained on the public TinyStories dataset.
