SmolGuru-80M-512

This public repo is being prepared for an ~80M-parameter LLaMA-style chatbot, trained from scratch, that uses the HuggingFaceTB/SmolLM2-135M-Instruct tokenizer.
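To illustrate how a LLaMA-style layout can land near 80M parameters, here is a rough sketch of the parameter arithmetic. The dimensions (hidden size 512, 16 layers, 8 attention heads, MLP width 1536, tied embeddings) are hypothetical and are not this repo's actual config. The vocabulary size of 49,152 is an assumption about the SmolLM2 tokenizer and should be checked against the tokenizer itself.

```python
def llama_param_count(vocab_size, hidden_size, num_layers, num_heads,
                      num_kv_heads, intermediate_size, tie_embeddings=True):
    """Count weights in a bias-free LLaMA-style decoder (RMSNorm, gated MLP)."""
    head_dim = hidden_size // num_heads
    embed = vocab_size * hidden_size                      # token embedding table
    q_proj = hidden_size * num_heads * head_dim           # query projection
    kv_proj = 2 * hidden_size * num_kv_heads * head_dim   # key + value projections
    o_proj = num_heads * head_dim * hidden_size           # attention output projection
    mlp = 3 * hidden_size * intermediate_size             # gate, up, and down projections
    norms = 2 * hidden_size                               # two RMSNorm weights per layer
    per_layer = q_proj + kv_proj + o_proj + mlp + norms
    lm_head = 0 if tie_embeddings else vocab_size * hidden_size
    final_norm = hidden_size
    return embed + num_layers * per_layer + final_norm + lm_head


# Hypothetical dimensions chosen only to illustrate the ~80M scale.
total = llama_param_count(
    vocab_size=49152,        # assumed SmolLM2 tokenizer vocabulary size
    hidden_size=512,
    num_layers=16,
    num_heads=8,
    num_kv_heads=8,
    intermediate_size=1536,
)
print(f"{total:,}")  # 79,708,672 under these assumed dimensions
```

With a vocabulary this large relative to the hidden size, the embedding table alone accounts for roughly 25M of the total, which is why tying the input and output embeddings matters at this scale.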

Status: scaffold/config repo created. Trained weights will be uploaded once pretraining and SFT are complete.

Planned training:

  • 2B high-grade pretraining tokens
  • 75k filtered SFT examples
  • 512-token context window
  • 256-token answer target
  • Final + best checkpoint retained
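
The planned numbers above imply a few simple budgets, sketched below. The figures come from the list; the helper names are illustrative, the parameter count uses the nominal 80M rather than an exact value, and the sequence count assumes the pretraining data is packed into full 512-token sequences.

```python
PRETRAIN_TOKENS = 2_000_000_000   # 2B pretraining tokens (from the plan)
CONTEXT_LENGTH = 512              # context window in tokens (from the plan)
ANSWER_TARGET = 256               # target answer length in tokens (from the plan)
NOMINAL_PARAMS = 80_000_000       # nominal ~80M parameters, not an exact count


def prompt_budget(context_length, answer_target):
    """Tokens left for the prompt (and any chat template) after reserving the answer."""
    return context_length - answer_target


def packed_sequences(total_tokens, context_length):
    """Full-length training sequences, assuming tokens are packed with no padding."""
    return total_tokens // context_length


def tokens_per_parameter(total_tokens, num_params):
    """Pretraining tokens seen per model parameter."""
    return total_tokens / num_params


print(prompt_budget(CONTEXT_LENGTH, ANSWER_TARGET))           # 256
print(packed_sequences(PRETRAIN_TOKENS, CONTEXT_LENGTH))      # 3906250
print(tokens_per_parameter(PRETRAIN_TOKENS, NOMINAL_PARAMS))  # 25.0
```

A 256-token answer target inside a 512-token window leaves at most 256 tokens for the prompt, so SFT examples with longer prompts would need to be truncated or filtered out.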