nanochat-d20-sft
A supervised fine-tuned (SFT) version of nanochat-d20, a 896 million parameter GPT-style language model trained from scratch at the Anuradha and Vikas Sinha Department of Data Science housed in the College of Information at the University of North Texas.
This model demonstrates the transformation from a raw base model to a conversational assistant through instruction tuning โ the same technique used to build ChatGPT from GPT-3.
Acknowledgements
Many thanks to Andrej Karpathy for the nanochat repository, which made this entire workshop possible.
What Changed After SFT
The base model knew facts but had no idea how to respond to questions. SFT taught it conversational structure, when to stop, and how to follow instructions.
| Prompt | Base Model | SFT Model |
|---|---|---|
| Why is the sky blue? | Partial answer, repeats forever | Correct Rayleigh scattering explanation |
| What is 5 times 12? | "37, 37, 37..." forever | "60" โ |
| Write a haiku about autumn | Analyzed the word, never wrote it | Wrote autumn poetry |
| What is the capital of France? | Asked itself in an infinite loop | Paris, with context |
| Explain what a transformer is | "Primary coil connected to load..." loop | Coherent structured explanation |
Model Details
| Property | Value |
|---|---|
| Parameters | 896 million |
| Layers (depth) | 20 |
| Attention heads | 10 |
| Embedding size | 1280 |
| Vocabulary size | 32,768 |
| Context length | 2048 tokens |
| Base model | cliffo4567/nanochat-d20 |
| SFT training time | ~59 minutes |
| Hardware | 1x NVIDIA H200 (143GB) |
SFT Training Data Mixture
| Dataset | Size | Purpose |
|---|---|---|
| SmolTalk | 460K rows | General conversations |
| MMLU | 100K rows x 3 epochs | Academic knowledge, multiple choice |
| GSM8K | 8K rows x 4 epochs | Math reasoning |
| SimpleSpelling | 200K rows | Character-level precision |
| SpellingBee | 80K rows | Letter counting tasks |
| Identity conversations | 1K rows x 2 epochs | Model identity |
Benchmark Results
| Benchmark | Base (nanochat-d20) | SFT (this model) |
|---|---|---|
| ChatCORE | 0.2462 | 0.3428 |
| Spelling |
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐ Ask for provider support
Model tree for cliffo4567/nanochat-d20-sft
Base model
cliffo4567/nanochat-d20