Instructions to use rodin-llm/rodin-1b with libraries, inference providers, notebooks, and local apps.

Libraries

How to use rodin-llm/rodin-1b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="rodin-llm/rodin-1b")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("rodin-llm/rodin-1b")
model = AutoModelForCausalLM.from_pretrained("rodin-llm/rodin-1b")

llama-cpp-python

How to use rodin-llm/rodin-1b with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="rodin-llm/rodin-1b",
	filename="rodin-1b-base-Q4_K_M.gguf",
)

output = llm(
	"Once upon a time,",
	max_tokens=512,
	echo=True
)
print(output)

llama.cpp

How to use rodin-llm/rodin-1b with llama.cpp:

Install (macOS, Linux)

curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf rodin-llm/rodin-1b:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf rodin-llm/rodin-1b:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf rodin-llm/rodin-1b:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf rodin-llm/rodin-1b:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf rodin-llm/rodin-1b:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf rodin-llm/rodin-1b:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf rodin-llm/rodin-1b:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf rodin-llm/rodin-1b:Q4_K_M

Use Docker

docker model run hf.co/rodin-llm/rodin-1b:Q4_K_M

vLLM

How to use rodin-llm/rodin-1b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "rodin-llm/rodin-1b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "rodin-llm/rodin-1b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
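
The same request can be issued from Python. The sketch below uses only the standard library; the helper names `build_completion_request` and `complete` are illustrative assumptions, not part of vLLM, and the route and JSON fields simply mirror the curl call above.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model="rodin-llm/rodin-1b",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions route.

    Hypothetical helper: it only mirrors the fields used in the curl example.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text (needs a running server)."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style completion responses carry the text under choices[0].text.
    return body["choices"][0]["text"]


# Example call, once the server is up:
# print(complete("Il était une fois,"))
```

Passing `base_url="http://localhost:30000"` should target a server listening on port 30000 instead, provided it exposes the same route.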

Use Docker

docker model run hf.co/rodin-llm/rodin-1b:Q4_K_M

SGLang

How to use rodin-llm/rodin-1b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "rodin-llm/rodin-1b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "rodin-llm/rodin-1b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "rodin-llm/rodin-1b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "rodin-llm/rodin-1b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Ollama

How to use rodin-llm/rodin-1b with Ollama:
```
ollama run hf.co/rodin-llm/rodin-1b:Q4_K_M
```

Unsloth Studio

How to use rodin-llm/rodin-1b with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for rodin-llm/rodin-1b to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for rodin-llm/rodin-1b to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for rodin-llm/rodin-1b to start chatting

Docker Model Runner

How to use rodin-llm/rodin-1b with Docker Model Runner:
```
docker model run hf.co/rodin-llm/rodin-1b:Q4_K_M
```

Lemonade

How to use rodin-llm/rodin-1b with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull rodin-llm/rodin-1b:Q4_K_M

Run and chat with the model

lemonade run user.rodin-1b-Q4_K_M

List all available models

lemonade list

RODIN-1B (base)

A French language model trained from scratch, solo, on consumer-grade hardware.

Un modèle de langage français entraîné de zéro, en solo, sur du matériel grand public.

🇬🇧 English · 🇫🇷 Français

This is the base pretrained model. For the conversational, instruction-tuned variant, see rodin-llm/rodin-1b-instruct.

Ceci est le modèle de base pré-entraîné. Pour la variante conversationnelle, voir rodin-llm/rodin-1b-instruct.

💻 Full source code / Code source complet : github.com/rodin-llm/rodin

The complete pipeline: data, tokenizer, pretraining, SFT, export, spot orchestration.

Le pipeline complet : données, tokenizer, pré-entraînement, SFT, export, orchestration spot.

🇬🇧 English

Overview

RODIN-1B is a 1.24-billion-parameter, French-only causal language model trained from scratch by a single person, as an open and reproducible research project. RODIN stands for Research Open Deep Intelligence Natively-french.

This is not a fine-tune or a derivative of an existing model. Every stage was built end to end: dataset collection and cleaning, a custom French BPE tokenizer, and pretraining.

This repository hosts the base pretrained model: a foundation model, suitable for further fine-tuning or for research on what a from-scratch 1B French model learns. It is not instruction-tuned and will not behave as a chat assistant — for that, use rodin-1b-instruct.

Why this model exists

The goal was never to compete with large, well-funded French models on raw benchmark scores — that would be a losing game by design. For scale: comparable French open-source efforts were trained on 3,000 billion tokens using hundreds of H100 GPUs on national supercomputers. RODIN-1B was trained on 32 billion tokens, by one person, on a rented spot B200 instance plus a single RTX 3090 for local iteration.

The value of RODIN is pedagogical and demonstrative: it shows what one motivated individual can build from scratch, end to end, with a small budget — and it documents every decision honestly, including the limitations.

Model description

| Property | Value |
| --- | --- |
| Parameters | 1.238 B |
| Architecture | LLaMA-style (RoPE, RMSNorm, SwiGLU, causal SDPA attention) |
| Hidden size | 2048 |
| Layers | 22 |
| Attention heads | 16 (no GQA, `n_kv_heads = n_heads`) |
| FFN intermediate size | 5461 |
| Vocabulary | 64,000 (custom SentencePiece BPE, trained on French) |
| Context length | 2048 tokens |
| RoPE theta | 10,000 |
| RMSNorm eps | 1e-5 |
| Weight tying | yes (input/output embeddings tied) |
| Training dtype | bfloat16 |
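
The parameter count can be cross-checked from the other rows of the table. The sketch below assumes a standard LLaMA-style layout that the card does not spell out: no bias terms, four square attention projections per layer, a three-matrix SwiGLU feed-forward block, two RMSNorm weight vectors per layer plus a final one, and no learned parameters for RoPE. The function name `count_parameters` is illustrative.

```python
def count_parameters(vocab=64_000, hidden=2048, layers=22, ffn=5461,
                     tied_embeddings=True):
    """Estimate the parameters of a LLaMA-style decoder without bias terms."""
    embedding = vocab * hidden
    attention = 4 * hidden * hidden   # Q, K, V and output projections (no GQA)
    mlp = 3 * hidden * ffn            # SwiGLU: gate, up and down projections
    norms = 2 * hidden                # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms
    final_norm = hidden
    # With tied embeddings the output head reuses the input embedding matrix.
    lm_head = 0 if tied_embeddings else vocab * hidden
    return embedding + layers * per_layer + final_norm + lm_head


total = count_parameters()
print(f"{total:,} parameters ({total / 1e9:.3f} B)")
# prints "1,238,415,360 parameters (1.238 B)"
```

Under these assumptions the total matches the 1.238 B reported above. Without weight tying, the output head would add another 131 million parameters.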

Training data

RODIN-1B was pretrained exclusively on open or public-domain French data. Sources:

| Source | Nature | License |
| --- | --- | --- |
| HPLT | Web-crawl corpus | CC0 (packaging) |
| CC100 | Web-crawl corpus (Common Crawl derived) | Permissive |
| Wikipedia (FR) | Encyclopedia | CC BY-SA |
| Wikisource (FR) | Public-domain texts | CC BY-SA / public domain |
| Pleias, books | Public-domain / open books | Open / public domain |
| Pleias, news | Open newspapers | Open / public domain |
| Légifrance | French legal/public documents | Open license |

Honest note on provenance. For web-crawl sources (HPLT, CC100), the open license covers the packaging of the dataset, not an individual guarantee on every underlying document. The Pleias corpora are specifically curated to be uncopyrighted or freely licensed and are the most provenance-safe part of the corpus. Wikipedia and Wikisource are credited here in accordance with their attribution (BY) terms.

Evaluation — FrenchBench

Evaluated with EleutherAI's lm-evaluation-harness, task group french_bench, 3-shot, full test sets (OrangeSum excluded due to a datasets library incompatibility).

| Task | Metric | Score |
| --- | --- | --- |
| Vocabulary | acc | 0.773 |
| Grammar | acc | 0.765 |
| Reading comprehension | acc | 0.606 |
| BoolQA | acc | 0.573 |
| Topic-based NLI | acc_norm | 0.498 |
| HellaSwag | acc_norm | 0.424 |
| FQuAD v2 (bool) | acc | 0.500 |
| XNLI | acc | 0.333 |
| ARC challenge | acc_norm | 0.265 |
| Trivia | f1 | 0.245 |
| FQuAD v2 (genq) | f1 | 0.165 |
| FQuAD v2 | f1 | 0.095 |

By category: Linguistic competence 0.68 · Reasoning (MCQ) 0.40 · Generative QA 0.16
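
The card does not say which tasks feed each category. As a rough sketch, the grouping below is an assumption chosen because unweighted means over it land close to the reported figures. It is not the author's documented method.

```python
from statistics import mean

# Per-task scores copied from the table above.
scores = {
    "Vocabulary": 0.773, "Grammar": 0.765, "Reading comprehension": 0.606,
    "BoolQA": 0.573, "Topic-based NLI": 0.498, "HellaSwag": 0.424,
    "FQuAD v2 (bool)": 0.500, "XNLI": 0.333, "ARC challenge": 0.265,
    "Trivia": 0.245, "FQuAD v2 (genq)": 0.165, "FQuAD v2": 0.095,
}

# ASSUMED grouping: the model card does not list the members of each category.
groups = {
    "Linguistic competence": ["Vocabulary", "Grammar",
                              "Reading comprehension", "BoolQA"],
    "Reasoning (MCQ)": ["Topic-based NLI", "HellaSwag", "FQuAD v2 (bool)",
                        "XNLI", "ARC challenge"],
    "Generative QA": ["Trivia", "FQuAD v2 (genq)", "FQuAD v2"],
}

category_means = {
    name: mean(scores[task] for task in tasks) for name, tasks in groups.items()
}
for name, value in category_means.items():
    print(f"{name}: {value:.3f}")
```

With this grouping the means come out at roughly 0.68, 0.40 and 0.17. The first two agree with the reported figures and the third is within 0.01 of the reported 0.16, so the actual grouping or rounding may differ slightly.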

How to read these results. The standout results are grammar (0.77) and vocabulary (0.77), the direct payoff of a French-only, quality corpus with a dedicated French tokenizer. Reasoning (ARC, HellaSwag) and generative QA (scored by F1) are low, which is expected for a 1B model trained on 32B tokens. RODIN's value is not its raw score but its demonstration of from-scratch training under tight constraints.

Intended use & limitations

Intended use. Research, education, a foundation for further fine-tuning, and study of from-scratch French LM training. This is a base model, not a chat assistant.

Limitations.

- Size. At 1.24B parameters, world knowledge and reasoning are limited; the model hallucinates on precise facts.
- Base model. Not instruction-tuned. It completes text; it does not follow instructions or hold a conversation.
- No safety tuning. No refusal or safety data was used. The model may produce harmful, biased, or inappropriate content. Not suitable for unsupervised or public-facing deployment.
- French only, context limited to 2048 tokens.
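
The 2048-token context limit means the prompt and the generated continuation share one budget. The sketch below shows one way to clamp a requested generation length so the two fit together. The helper name `max_new_tokens_for` and its default of 512 requested tokens are illustrative assumptions; only the 2048 figure comes from the card.

```python
CONTEXT_LENGTH = 2048  # context length stated in the model description


def max_new_tokens_for(prompt_tokens, requested=512,
                       context_length=CONTEXT_LENGTH):
    """Clamp a requested generation length so prompt + output fits the context."""
    if prompt_tokens >= context_length:
        raise ValueError("prompt already fills the context window; truncate it first")
    return min(requested, context_length - prompt_tokens)


print(max_new_tokens_for(100))    # prints 512: the request fits as-is
print(max_new_tokens_for(2000))   # prints 48: only 48 positions remain
```

With the Transformers tokenizer, the prompt length can be obtained as `len(tok(prompt)["input_ids"])` and the result passed to `generate` as `max_new_tokens`.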

Usage

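import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rodin-llm/rodin-1b"
tok = AutoTokenizer.from_pretrained(model_id)
# Load the weights in bfloat16 (the training dtype) and move them to the GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16).cuda()

# Base model: give it text to continue, not an instruction.
prompt = "La photosynthèse est"
ids = tok(prompt, return_tensors="pt").to("cuda")
# do_sample=True is required for temperature and top_p to take effect.
out = model.generate(**ids, max_new_tokens=80, do_sample=True, temperature=0.7, top_p=0.9)
print(tok.decode(out[0], skip_special_tokens=True))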

Training procedure (summary)

Pretraining: 478,000 steps, ~32B French tokens, on a rented spot B200 instance (bfloat16).
Tokenizer: custom 64K SentencePiece BPE trained on French data.
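
A back-of-the-envelope reading of these figures is sketched below. It uses only numbers stated in this card (478,000 steps, about 32B tokens, 1.238B parameters, 2048-token context). The batch configuration itself is not stated, so the per-step values are derived estimates rather than documented settings.

```python
steps = 478_000
tokens = 32e9        # "~32B" in the card, so every result here is approximate
params = 1.238e9
context = 2048

tokens_per_step = tokens / steps
sequences_per_step = tokens_per_step / context  # if every sequence is full length
tokens_per_param = tokens / params

print(f"tokens per step:      {tokens_per_step:,.0f}")
print(f"sequences per step:   {sequences_per_step:.1f}")
print(f"tokens per parameter: {tokens_per_param:.1f}")
```

This works out to roughly 67,000 tokens per optimizer step, about 33 full-length sequences, and about 26 training tokens per parameter. That would be consistent with a batch of around 32 sequences of 2048 tokens, but this is an inference from rounded figures.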

License

Released under the Apache 2.0 license.

Acknowledgements & transparency

Carried out by one person, with AI assistance openly acknowledged throughout. Thanks to EleutherAI (evaluation), the HPLT and Pleias teams (data), Wikimedia, and the llama.cpp / Ollama / LM Studio projects.

@misc{rodin1b2026,
  title  = {RODIN-1B: A French Language Model Trained From Scratch},
  author = {RODIN},
  year   = {2026},
  howpublished = {\url{https://huggingface.co/rodin-llm/rodin-1b}}
}

🇫🇷 Français

Présentation

RODIN-1B est un modèle de langage causal uniquement francophone, de 1,24 milliard de paramètres, entraîné de zéro par une seule personne, dans le cadre d'un projet de recherche ouvert et reproductible. RODIN signifie Research Open Deep Intelligence Natively-french.

Ce n'est ni un fine-tune ni un dérivé d'un modèle existant. Chaque étape a été construite de bout en bout : collecte et nettoyage des données, tokenizer BPE français maison, et pré-entraînement.

Ce dépôt héberge le modèle de base pré-entraîné : un modèle de fondation, adapté à un fine-tuning ultérieur ou à l'étude de ce qu'un modèle français de 1B apprend from scratch. Il n'est pas instruction-tuné et ne se comporte pas comme un assistant — pour cela, utilisez rodin-1b-instruct.

Pourquoi ce modèle existe

L'objectif n'a jamais été de rivaliser en score brut avec les gros modèles français bien financés — ce serait perdu d'avance, par construction. Pour situer : des projets français open source comparables ont été entraînés sur 3 000 milliards de tokens avec des centaines de GPU H100 sur des supercalculateurs nationaux. RODIN-1B a été entraîné sur 32 milliards de tokens, par une seule personne, sur une instance B200 spot louée à l'heure et une seule RTX 3090 pour l'itération locale.

La valeur de RODIN est pédagogique et démonstrative : il montre ce qu'une personne motivée peut construire de zéro, de bout en bout, avec un petit budget — et il documente honnêtement chaque décision, limites comprises.

Description du modèle

| Propriété | Valeur |
| --- | --- |
| Paramètres | 1,238 milliard |
| Architecture | Style LLaMA (RoPE, RMSNorm, SwiGLU, attention causale SDPA) |
| Dimension cachée | 2048 |
| Couches | 22 |
| Têtes d'attention | 16 (pas de GQA, `n_kv_heads = n_heads`) |
| Dimension FFN | 5461 |
| Vocabulaire | 64 000 (BPE SentencePiece maison, entraîné sur du français) |
| Longueur de contexte | 2048 tokens |
| RoPE theta | 10 000 |
| RMSNorm eps | 1e-5 |
| Weight tying | oui (embeddings entrée/sortie liés) |
| Précision d'entraînement | bfloat16 |

Données d'entraînement

RODIN-1B a été pré-entraîné exclusivement sur des données françaises ouvertes ou du domaine public. Sources :

| Source | Nature | Licence |
| --- | --- | --- |
| HPLT | Corpus web-crawl | CC0 (packaging) |
| CC100 | Corpus web-crawl (dérivé de Common Crawl) | Permissive |
| Wikipédia (FR) | Encyclopédie | CC BY-SA |
| Wikisource (FR) | Textes du domaine public | CC BY-SA / domaine public |
| Pleias, livres | Livres libres / domaine public | Libre / domaine public |
| Pleias, presse | Journaux ouverts | Libre / domaine public |
| Légifrance | Documents juridiques et publics français | Licence ouverte |

Note honnête sur la provenance. Pour les sources web-crawl (HPLT, CC100), la licence ouverte couvre le packaging du dataset, pas une garantie individuelle sur chaque document. Les corpus Pleias sont spécifiquement constitués de données non soumises au droit d'auteur ou librement licenciées : c'est la partie la plus sûre du corpus en matière de provenance. Wikipédia et Wikisource sont crédités conformément à leurs clauses d'attribution (BY).

Évaluation — FrenchBench

Évalué avec lm-evaluation-harness d'EleutherAI, groupe french_bench, 3-shot, jeux de test complets (OrangeSum exclu pour incompatibilité de la librairie datasets).

| Tâche | Métrique | Score |
| --- | --- | --- |
| Vocabulaire | acc | 0,773 |
| Grammaire | acc | 0,765 |
| Compréhension écrite | acc | 0,606 |
| BoolQA | acc | 0,573 |
| NLI thématique | acc_norm | 0,498 |
| HellaSwag | acc_norm | 0,424 |
| FQuAD v2 (bool) | acc | 0,500 |
| XNLI | acc | 0,333 |
| ARC challenge | acc_norm | 0,265 |
| Trivia | f1 | 0,245 |
| FQuAD v2 (genq) | f1 | 0,165 |
| FQuAD v2 | f1 | 0,095 |

Par catégorie : Compétence linguistique 0,68 · Raisonnement (QCM) 0,40 · QA génératif 0,16

Comment lire ces résultats. Les meilleurs résultats sont la grammaire (0,77) et le vocabulaire (0,77) : c'est le bénéfice direct d'un corpus français de qualité avec un tokenizer dédié. Le raisonnement (ARC, HellaSwag) et le QA génératif (mesuré en F1) sont faibles, ce qui est attendu pour un 1B entraîné sur 32B tokens. La valeur de RODIN n'est pas son score brut mais sa démonstration d'un entraînement from scratch sous fortes contraintes.

Usage prévu & limites

Usage prévu. Recherche, éducation, base pour un fine-tuning ultérieur, étude de l'entraînement de LM français from scratch. C'est un modèle de base, pas un assistant.

Limites.

- Taille. À 1,24B paramètres, connaissances et raisonnement limités ; hallucine sur les faits précis.
- Modèle de base. Non instruction-tuné. Il complète du texte ; il ne suit pas d'instructions et ne tient pas de conversation.
- Aucun safety tuning. Aucune donnée de refus ou de sécurité. Le modèle peut produire des contenus nuisibles, biaisés ou inappropriés. Non adapté à un déploiement public ou non supervisé.
- Français uniquement, contexte limité à 2048 tokens.

Utilisation

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rodin-llm/rodin-1b"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16).cuda()

prompt = "La photosynthèse est"
ids = tok(prompt, return_tensors="pt").to("cuda")
# do_sample=True est nécessaire pour que temperature et top_p soient pris en compte.
out = model.generate(**ids, max_new_tokens=80, do_sample=True, temperature=0.7, top_p=0.9)
print(tok.decode(out[0], skip_special_tokens=True))

Procédure d'entraînement (résumé)

Pré-entraînement : 478 000 steps, ~32B tokens français, sur une instance B200 spot louée (bfloat16).
Tokenizer : BPE SentencePiece 64K maison, entraîné sur des données françaises.

Licence

Publié sous licence Apache 2.0.

Remerciements & transparence

Mené par une seule personne, avec une assistance IA assumée et transparente tout du long. Merci à EleutherAI (évaluation), aux équipes HPLT et Pleias (données), à Wikimedia, et aux projets llama.cpp / Ollama / LM Studio.

@misc{rodin1b2026,
  title  = {RODIN-1B: A French Language Model Trained From Scratch},
  author = {RODIN},
  year   = {2026},
  howpublished = {\url{https://huggingface.co/rodin-llm/rodin-1b}}
}
