AI & ML interests
French language models, from-scratch LLM training, open-source NLP
Recent Activity
RODIN
Research Open Deep Intelligence Natively-french
French large language models, built from scratch — solo, on consumer-grade hardware. Des modèles de langage français, construits de zéro — en solo, sur du matériel grand public.
RODIN is an open, reproducible research project: training French-only language models entirely from scratch — custom tokenizer, custom architecture, hand-built data pipeline, pretraining from random weights. No fine-tune, no derivative.
The first release is RODIN-1B (1.24 B parameters, 32 B training tokens), trained by one person on a rented spot B200 and a single RTX 3090.
Models
- 🧠 rodin-1b — base pretrained model
- 💬 rodin-1b-instruct — conversational model (ChatML) + GGUF
Code
- 💻 github.com/rodin-llm/rodin — the complete pipeline: data, tokenizer, pretraining, SFT, export, and spot-GPU orchestration. Fully reproducible, Apache 2.0.
The vision
RODIN-1B is the first model, not the last. The same pipeline scales to larger sizes (3B → 17B); the blocker is compute budget, not capability. If you'd like to see the RODIN family grow — or sponsor a larger model — the door is open: rodin.lab@proton.me
One person, AI assistance openly acknowledged. Pedagogical and frugal by design. Une seule personne, assistance IA assumée. Pédagogique et frugal par choix.