RecursiveComplete

A small GPT-2-style language model (~18.3M parameters) trained completely from scratch by an AI, end to end: the architecture, training code, tokenizer, data prep, and training run were all written and executed by an AI agent, with no pre-existing weights and no fine-tuning from another model.

This is a text-completion model, not an instruction-tuned chatbot. It's good at continuing short prose and simple stories. It is not good at answering questions, following instructions, or factual recall.

Note: this is a custom-format model, not a transformers model. You load it with the included scripts (gpt2.py + chat.py), not AutoModelForCausalLM. The file names mention GPT-2 only because the model follows the same architecture style; no GPT-2 weights are involved.

Model details

| Property | Value |
| --- | --- |
| Type | Decoder-only transformer (GPT-2 style) |
| Parameters | ~18.3M |
| Embedding dim (n_embd) | 448 |
| Heads (n_head) | 7 |
| Layers (n_layer) | 6 |
| Context length (block_size) | 256 |
| Vocab size | 8192 |
| Tokenizer | Byte-level BPE (<eot> id = 0) |
| Dropout | 0.1 |
| Final train loss | ~1.86 |
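
As a sanity check on the ~18.3M figure, the parameter count can be reproduced from the hyperparameters above. This is a minimal sketch that assumes the standard GPT-2 layout (biases on every linear layer, a 4x MLP expansion, learned position embeddings, and an output head tied to the token embedding). None of those details are confirmed by this card, and `gpt2_param_count` is a hypothetical helper, not a function from gpt2.py.

```python
def gpt2_param_count(n_embd, n_layer, vocab_size, block_size, tied_embeddings=True):
    """Count trainable parameters for an assumed standard GPT-2 layout."""
    d = n_embd
    # Attention: fused QKV projection plus output projection, both with biases.
    attn = (d * 3 * d + 3 * d) + (d * d + d)
    # MLP: d -> 4d -> d, both with biases (4x expansion is an assumption).
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)
    # Two LayerNorms per block, each with a weight and a bias vector.
    layer_norms = 2 * (2 * d)
    per_block = attn + mlp + layer_norms
    # Token embedding plus learned position embedding.
    embeddings = vocab_size * d + block_size * d
    final_ln = 2 * d
    # A tied output head reuses the token embedding and adds nothing.
    lm_head = 0 if tied_embeddings else vocab_size * d
    return n_layer * per_block + embeddings + final_ln + lm_head


total = gpt2_param_count(n_embd=448, n_layer=6, vocab_size=8192, block_size=256)
head_dim = 448 // 7  # per-head width implied by n_embd and n_head
print(f"{total:,} parameters, head dim {head_dim}")
```

Under these assumptions the count is 18,271,232, which matches the stated ~18.3M, and each of the 7 heads is 64 wide. With an untied output head the same formula gives 21,941,248.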

Training data

Trained primarily on TinyStories (~90M tokens) with a small amount of Alpaca-style data. The model learned general English sentence structure and simple narrative flow, not world knowledge.

Files in this repo

| File | What it is |
| --- | --- |
| model.safetensors | The model weights |
| config.json | Architecture config (custom format) |
| gpt2.py | Model definition (the GPT-2-style architecture) |
| chat.py | Run / generate from the model |
| tokenizer_bpe/vocab.json, tokenizer_bpe/merges.txt | Byte-level BPE tokenizer |
| big.pt | Full training checkpoint (model + optimizer), for resuming training only |
| train_big.py, prep_bpe.py | Training and data-prep scripts |
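
To see what model.safetensors contains without loading any weights, you can read the file header directly. The sketch below uses only the standard library and relies on the published safetensors layout: an 8-byte little-endian length followed by a JSON header. The helper names `read_safetensors_shapes` and `count_elements` are illustrative and not part of this repo.

```python
import json
import struct
from math import prod
from typing import Dict, List


def read_safetensors_shapes(path: str) -> Dict[str, List[int]]:
    """Return {tensor_name: shape} from a .safetensors header."""
    with open(path, "rb") as f:
        # First 8 bytes: header size as an unsigned little-endian 64-bit int.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return {name: info["shape"] for name, info in header.items()}


def count_elements(shapes: Dict[str, List[int]]) -> int:
    """Total number of stored values across all tensors."""
    return sum(prod(shape) for shape in shapes.values())
```

Calling `read_safetensors_shapes("model.safetensors")` from the repo directory lists every stored tensor. The element count from the file may be higher than the trainable-parameter count if the checkpoint also stores non-trainable buffers or a second copy of a tied weight.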

Intended use

  • Story / prose continuation
  • Experimentation and education (a clean, fully-from-scratch small LM)

How to use

This model uses its own minimal code, not the transformers library.

# 1. Install deps
pip install torch tokenizers safetensors numpy

# 2. Download this repo (gives you the scripts + weights + tokenizer)
pip install huggingface_hub
hf download Gentraxyz/RecursiveComplete --local-dir RecursiveComplete
cd RecursiveComplete

# 3. Generate
python chat.py

chat.py loads gpt2.py (the architecture), the weights from model.safetensors, and the BPE tokenizer in tokenizer_bpe/, then lets you prompt the model for completions.

Tip: it's a completion model, so give it the start of something ("Once upon a time there was a small robot who") rather than a question.
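
For readers who want to drive the model from their own code, here is a rough sketch of what a completion loop has to do, given the 256-token context and the <eot> id of 0 listed in the model details. It is not how chat.py is implemented, since its internals are not shown here. The `next_token` callable is a hypothetical stand-in for "run the model on these ids and sample one token".

```python
from typing import Callable, List, Sequence

BLOCK_SIZE = 256  # context length from the model details
EOT_ID = 0        # <eot> token id from the model details


def complete(
    prompt_ids: Sequence[int],
    next_token: Callable[[List[int]], int],  # hypothetical: model forward + sampling
    max_new_tokens: int = 64,
) -> List[int]:
    """Extend prompt_ids one token at a time, stopping early at <eot>."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # The model can only attend to the most recent BLOCK_SIZE tokens.
        context = ids[-BLOCK_SIZE:]
        tok = next_token(context)
        if tok == EOT_ID:
            break  # end of text: do not append the marker itself
        ids.append(tok)
    return ids
```

Because of the cropping step, anything earlier than the last 256 tokens is invisible to the model, so long prompts lose their beginning.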

Limitations

  • Completion only: will not reliably answer questions or follow instructions.
  • No factual reliability; it will confidently make things up.
  • Small context (256 tokens) and small vocab (8192).
  • English only.

License

Apache 2.0.

Note

This model was trained entirely by an AI, including writing the model code, the tokenizer, the data pipeline, and running the training. It is shared as a small from-scratch experiment.
