RitsuGPT
A small, from-scratch GPT in pure Rust. It trains on a single consumer GPU (an NVIDIA GeForce RTX 5060, 8 GB) and runs on your own computer. nanoGPT, in Rust.
Trainer & source code: github.com/NeonixLabs/RitsuGPT · Part of Neonix Labs.
What it is, honestly: a ~16.9M-parameter small language model in the spirit of TinyStories (Eldan & Li, 2023). It learns to write simple, coherent short English stories. It is not a production assistant: no world knowledge, no reasoning, no instruction following. Its value is a clean, hackable, from-scratch stack you can train and verify yourself.
Files
| File | What |
|---|---|
| ritsu-step25000.mpk | Weights at 25,000 steps (recommended), burn CompactRecorder format |
| ritsu-step12000.mpk | Weights at 12,000 steps (earlier checkpoint) |
| tokenizer.json | Byte-level BPE tokenizer (vocab 8192), HuggingFace tokenizers format |
Results
Evaluation reports bits-per-byte (BPB) on the TinyStories validation set. BPB is tokenizer-invariant; lower is better.
| Checkpoint | Steps | BPB |
|---|---|---|
| ritsu-step12000.mpk | 12,000 | 0.695 |
| ritsu-step25000.mpk | 25,000 | 0.6843 |
| byte-level baseline | - | 0.805 |
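For readers who want to reproduce a BPB figure from a training log, the usual conversion from mean cross-entropy (nats per token) to bits per byte can be sketched as below. This is a minimal illustration of the standard formula, not the trainer's actual evaluation code; the function name `bits_per_byte` and the numbers in `main` are hypothetical.

```rust
/// Convert a mean cross-entropy loss (nats per token) into bits per byte.
///
/// `mean_loss_nats`: average next-token loss over the evaluation text, in nats.
/// `n_tokens`: number of predicted tokens.
/// `n_bytes`: number of UTF-8 bytes in the same text.
fn bits_per_byte(mean_loss_nats: f64, n_tokens: u64, n_bytes: u64) -> f64 {
    // Total information over the whole text, in nats.
    let total_nats = mean_loss_nats * n_tokens as f64;
    // 1 nat = 1 / ln(2) bits.
    let total_bits = total_nats / std::f64::consts::LN_2;
    // Normalising by bytes rather than tokens removes the tokenizer from the metric.
    total_bits / n_bytes as f64
}

fn main() {
    // Hypothetical values, chosen only to illustrate the arithmetic.
    let bpb = bits_per_byte(2.0, 1_000, 4_000);
    println!("bits per byte: {bpb:.4}");
}
```

Because the denominator counts bytes of raw text, two models with different vocabularies can be compared on the same validation set.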
How to run
This is a Rust / burn model, not a transformers model, so there is no hosted inference widget. Run it locally with the trainer:
git clone https://github.com/NeonixLabs/RitsuGPT
cd RitsuGPT
# put ritsu-step25000.mpk and tokenizer.json in this folder (download them from this repo)
cargo run --release --bin neonix-train -- sample ./ritsu-step25000 ./tokenizer.json "Once upon a time" 200 0.8 40
Pass the checkpoint path without the .mpk suffix; the loader appends it. Inference runs on CPU.
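The three trailing numbers in the sample command (200 0.8 40) are not documented here. A plausible reading, and it is only an assumption, is maximum new tokens, temperature, and top-k. Under that assumption, temperature plus top-k sampling can be sketched in plain Rust as follows. The functions `top_k_probs` and `sample_index` are hypothetical names for illustration and are not taken from the RitsuGPT source.

```rust
/// Turn raw logits into a probability distribution restricted to the `top_k`
/// highest-scoring tokens, after dividing each logit by `temperature`.
fn top_k_probs(logits: &[f32], temperature: f32, top_k: usize) -> Vec<f32> {
    assert!(temperature > 0.0 && top_k > 0 && !logits.is_empty());
    // Token indices sorted by logit, highest first.
    let mut order: Vec<usize> = (0..logits.len()).collect();
    order.sort_by(|&a, &b| logits[b].total_cmp(&logits[a]));
    let keep = &order[..top_k.min(order.len())];

    // Softmax over the kept logits, shifted by the max for numerical stability.
    let max = logits[keep[0]] / temperature;
    let mut probs = vec![0.0f32; logits.len()];
    let mut sum = 0.0f32;
    for &i in keep {
        let e = (logits[i] / temperature - max).exp();
        probs[i] = e;
        sum += e;
    }
    for p in probs.iter_mut() {
        *p /= sum;
    }
    probs
}

/// Pick a token id by inverse-CDF sampling, given `u` drawn uniformly from [0, 1).
fn sample_index(probs: &[f32], u: f32) -> usize {
    let mut acc = 0.0f32;
    let mut last_nonzero = 0;
    for (i, &p) in probs.iter().enumerate() {
        if p > 0.0 {
            acc += p;
            last_nonzero = i;
            if u < acc {
                return i;
            }
        }
    }
    // Fallback for rounding error when `u` is very close to 1.
    last_nonzero
}

fn main() {
    // Toy 4-token vocabulary; keep only the 2 most likely tokens.
    let logits = [2.0, 1.0, 0.5, -1.0];
    let probs = top_k_probs(&logits, 0.8, 2);
    let token = sample_index(&probs, 0.5);
    println!("probs = {probs:?}, sampled token id = {token}");
}
```

Lower temperatures sharpen the distribution toward the most likely token, and a smaller top-k cuts off the unlikely tail. The uniform draw `u` is passed in so the sketch needs no random-number crate.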
Architecture
A standard decoder-only Transformer, optimized in Rust.
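What makes a Transformer "decoder-only" is causal self-attention: each position attends only to itself and to earlier positions. The sketch below shows that idea for a single head in dependency-free Rust. It is a generic illustration, not the RitsuGPT implementation (which uses burn tensors), and the function name `causal_attention` is hypothetical.

```rust
/// Single-head causal self-attention over `t` positions of width `d`.
/// `q`, `k`, `v` each hold `t` rows of `d` values; the result has the same shape.
fn causal_attention(q: &[Vec<f32>], k: &[Vec<f32>], v: &[Vec<f32>]) -> Vec<Vec<f32>> {
    let t = q.len();
    let d = q[0].len();
    let scale = 1.0 / (d as f32).sqrt();
    let mut out = vec![vec![0.0f32; d]; t];
    for i in 0..t {
        // Causal mask: position i only sees positions 0..=i.
        let scores: Vec<f32> = (0..=i)
            .map(|j| q[i].iter().zip(&k[j]).map(|(a, b)| a * b).sum::<f32>() * scale)
            .collect();
        // Numerically stable softmax over the visible positions.
        let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
        let sum: f32 = exps.iter().sum();
        // Weighted sum of the value rows.
        for (j, e) in exps.iter().enumerate() {
            let w = e / sum;
            for c in 0..d {
                out[i][c] += w * v[j][c];
            }
        }
    }
    out
}

fn main() {
    // With all-zero queries and keys, attention weights are uniform over the visible prefix.
    let q: Vec<Vec<f32>> = vec![vec![0.0, 0.0], vec![0.0, 0.0]];
    let k = q.clone();
    let v: Vec<Vec<f32>> = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let out = causal_attention(&q, &k, &v);
    println!("{out:?}");
}
```

The first output row depends only on the first value row, which is the property that lets the model generate text one token at a time.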
License
MIT. Trained on the public TinyStories dataset.