Instructions to use Shamima/babylm-2026-multilingual-uniform-100M with libraries and local apps.

Libraries

How to use Shamima/babylm-2026-multilingual-uniform-100M with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Shamima/babylm-2026-multilingual-uniform-100M")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Shamima/babylm-2026-multilingual-uniform-100M")
# The checkpoint is a text-only LlamaForCausalLM, so load it with the causal-LM auto class.
model = AutoModelForCausalLM.from_pretrained("Shamima/babylm-2026-multilingual-uniform-100M")

Local Apps

vLLM

How to use Shamima/babylm-2026-multilingual-uniform-100M with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Shamima/babylm-2026-multilingual-uniform-100M"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Shamima/babylm-2026-multilingual-uniform-100M",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
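
The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are illustrative and not part of vLLM, and the default base URL assumes the server started by `vllm serve` above is listening on port 8000.

```python
import json
import urllib.request

MODEL_ID = "Shamima/babylm-2026-multilingual-uniform-100M"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion.

    Needs a running server; nothing is sent until this function is called.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` returns the generated continuation. Passing `base_url="http://localhost:30000"` should target an SGLang server started as described in the SGLang section.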

Use Docker

docker model run hf.co/Shamima/babylm-2026-multilingual-uniform-100M

SGLang

How to use Shamima/babylm-2026-multilingual-uniform-100M with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Shamima/babylm-2026-multilingual-uniform-100M" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Shamima/babylm-2026-multilingual-uniform-100M",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Shamima/babylm-2026-multilingual-uniform-100M" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Shamima/babylm-2026-multilingual-uniform-100M",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner

How to use Shamima/babylm-2026-multilingual-uniform-100M with Docker Model Runner:

```
docker model run hf.co/Shamima/babylm-2026-multilingual-uniform-100M
```

BabyLM 2026 — MultiLingual track baseline (byte-premium-uniform)

A 110M-param Llama-style decoder pre-trained from scratch on the BabyBabelLM trilingual corpus (English, Dutch, Chinese), under the BabyLM 2026 MultiLingual track rules: 100M reference tokens, byte-premium adjusted, ≤10 epochs.

This is the baseline zero-point of our ablation grid. Subsequent runs vary the mixture allocation (loss-weighted, simultaneous-bilingual, typological-bridge curriculum, register-controlled) on top of an identical scaffold. The matching ablation paper is in preparation.

Architecture

Llama (HF LlamaForCausalLM) — RoPE, RMSNorm, SwiGLU, no biases, tied embeddings
12 layers · 768 hidden · 12 heads · 2048 FFN
1024 sequence length
110,119,680 parameters
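
The parameter count can be reproduced from the dimensions above together with the 32,768-entry vocabulary from the Tokenizer section. The sketch below is a worked check, not code from the training scaffold; it assumes standard multi-head attention (as many key/value heads as query heads), which the card does not state explicitly but which is the setting that reproduces the listed total. The function name `llama_param_count` is illustrative.

```python
def llama_param_count(vocab_size, hidden, n_layers, ffn, tied_embeddings=True):
    """Count parameters of a bias-free Llama-style decoder.

    Assumes full multi-head attention (n_kv_heads == n_heads), so the
    q, k, v and o projections are each hidden x hidden.
    """
    embed = vocab_size * hidden        # token embedding matrix
    attn = 4 * hidden * hidden         # q, k, v, o projections
    mlp = 3 * hidden * ffn             # SwiGLU: gate, up, down projections
    norms = 2 * hidden                 # two RMSNorm weight vectors per layer
    per_layer = attn + mlp + norms
    final_norm = hidden                # RMSNorm before the output head
    # With tied embeddings the output head reuses the embedding matrix.
    lm_head = 0 if tied_embeddings else vocab_size * hidden
    return embed + n_layers * per_layer + final_norm + lm_head


total = llama_param_count(vocab_size=32768, hidden=768, n_layers=12, ffn=2048)
print(f"{total:,}")  # 110,119,680
```

Of this total, 25,165,824 parameters sit in the tied embedding matrix and 7,079,424 in each of the 12 layers.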

Tokenizer

Joint byte-level BPE, 32,768 vocab, trained on a balanced 50M-char sample from each of EN/NL/ZH. The same tokenizer is shared across all three languages (see the data card for why a joint tokenizer is required: ZH is 6.8% Latin script).

Training

Data: BabyLM-community/babylm-eng + babylm-nld + babylm-zho (BabyBabelLM 2026 100M tier). Full corpora loaded in memory and shuffled (the Hub layout is category-clustered; streaming with reasonable buffers produces a biased sample).
Mixture: byte-premium-uniform — equal share of reference tokens per language (1/3 each), achieved by deficit-driven selection, not uniform doc sampling (mean doc sizes differ across languages).
Optimizer: AdamW (β₁=0.9, β₂=0.95, wd=0.1), lr 6e-4, cosine to 10%, 100-step warmup
Compute: 4× NVIDIA A10G (23 GB), bf16, DDP, micro-batch 16 × grad-accum 2 × 4 GPUs (effective batch 128 sequences × 1024 tokens = 131,072 tokens/step)
Tokens consumed at this checkpoint: 100,000,000 byte-premium-adjusted reference tokens
Per-language epochs at this checkpoint: ≈1.0 each (within the BabyLM ≤10-epoch cap)
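
To illustrate why deficit-driven selection differs from uniform document sampling, here is a minimal sketch of one plausible reading of the mixture rule: at each step, draw the next document from whichever language is furthest below its target share of the tokens consumed so far. This is an assumption about what "deficit-driven" means, not the scaffold's actual code; the function name and the toy document sizes are hypothetical.

```python
def deficit_driven_order(doc_sizes, budget, shares):
    """Pick documents language by language until `budget` tokens are consumed.

    doc_sizes: dict mapping language -> list of per-document reference-token
               counts, assumed to be shuffled already.
    shares:    dict mapping language -> target fraction of the budget.
    Returns the sequence of chosen languages and the tokens consumed per language.
    """
    consumed = {lang: 0 for lang in doc_sizes}
    cursor = {lang: 0 for lang in doc_sizes}
    order = []
    while sum(consumed.values()) < budget:
        available = [lang for lang in doc_sizes if cursor[lang] < len(doc_sizes[lang])]
        if not available:
            break  # every corpus is exhausted before the budget is met
        total = sum(consumed.values())
        # Deficit = tokens the language should have by now minus tokens it has.
        lang = max(available, key=lambda l: shares[l] * total - consumed[l])
        consumed[lang] += doc_sizes[lang][cursor[lang]]
        cursor[lang] += 1
        order.append(lang)
    return order, consumed


# Toy corpora with different mean document sizes (hypothetical numbers).
toy_docs = {"eng": [100] * 30, "nld": [50] * 60, "zho": [200] * 15}
shares = {lang: 1 / 3 for lang in toy_docs}
order, consumed = deficit_driven_order(toy_docs, budget=3000, shares=shares)
print(consumed)
print({lang: order.count(lang) for lang in toy_docs})
```

In this toy run the token totals stay close to one third each, while the number of documents drawn differs widely by language. Sampling documents uniformly would instead equalize document counts and skew the token shares toward the language with the longest documents.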

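The batch arithmetic and learning-rate schedule listed above can be sketched as follows. Two points are assumptions rather than statements from the card: warmup is taken to be linear, and "cosine to 10%" is read as cosine decay to 10% of the peak learning rate. The step count further assumes that one reference token corresponds to one token position in a batch, which may not hold exactly under byte-premium adjustment.

```python
import math

N_GPUS, MICRO_BATCH, GRAD_ACCUM, SEQ_LEN = 4, 16, 2, 1024
TOKEN_BUDGET = 100_000_000
PEAK_LR, MIN_LR_RATIO, WARMUP_STEPS = 6e-4, 0.10, 100

sequences_per_step = N_GPUS * MICRO_BATCH * GRAD_ACCUM   # 128
tokens_per_step = sequences_per_step * SEQ_LEN           # 131,072
# Approximate: treats reference tokens and batch token positions as equal.
total_steps = math.ceil(TOKEN_BUDGET / tokens_per_step)  # 763


def lr_at(step, total=total_steps):
    """Learning rate at a 0-indexed optimizer step.

    Linear warmup (assumed) followed by cosine decay to MIN_LR_RATIO * PEAK_LR.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    progress = min(1.0, (step - WARMUP_STEPS) / max(1, total - WARMUP_STEPS))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return PEAK_LR * (MIN_LR_RATIO + (1.0 - MIN_LR_RATIO) * cosine)


for step in (0, WARMUP_STEPS, total_steps // 2, total_steps):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the 100-step warmup covers roughly 13% of the run, so the schedule spends most of its steps in the cosine phase.
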
Revisions

The chck_{N}M revisions match the BabyLM eval pipeline's fast-eval naming:

chck_1M, chck_2M, ..., chck_9M, chck_10M, chck_20M, ..., chck_90M, chck_100M

Pass revision="chck_{N}M" (for example, revision="chck_10M") to from_pretrained to load any milestone. The default branch (main) corresponds to chck_100M.
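
A short sketch of enumerating and loading the milestones follows. The helper names are illustrative, and the expansion of the elided entries (every 1M up to 9M, then every 10M up to 100M) is an assumption read off the pattern above; check the repository's branch list for the revisions that actually exist.

```python
MODEL_ID = "Shamima/babylm-2026-multilingual-uniform-100M"


def checkpoint_revisions():
    """Return the assumed milestone names in training order."""
    every_1m = [f"chck_{n}M" for n in range(1, 10)]         # chck_1M .. chck_9M
    every_10m = [f"chck_{n}M" for n in range(10, 101, 10)]  # chck_10M .. chck_100M
    return every_1m + every_10m


def load_checkpoint(revision="main"):
    """Download and load the tokenizer and model at a given revision.

    Imported lazily so that listing revisions does not require transformers.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, revision=revision)
    return tokenizer, model


print(checkpoint_revisions())
```

For example, `tokenizer, model = load_checkpoint("chck_10M")` should fetch the 10M-token milestone, and iterating over `checkpoint_revisions()` gives one way to trace a learning curve across checkpoints.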

How to evaluate

git clone https://github.com/babylm-org/babylm-eval
cd babylm-eval/multilingual
bash scripts/zeroshot_model.sh --model_name Shamima/babylm-2026-multilingual-uniform-100M
bash scripts/zeroshot_model_fast_all.sh --model_name Shamima/babylm-2026-multilingual-uniform-100M

Citation

@misc{babylm-2026-uniform,
  title  = {BabyLM 2026 MultiLingual baseline (byte-premium-uniform)},
  author = {Hossain, Shamima},
  year   = {2026},
  url    = {https://huggingface.co/Shamima/babylm-2026-multilingual-uniform-100M}
}

Companion repo with audit, scaffold, and ablation configs: https://github.com/silvererudite/bb-lm-challenge-sub

Safetensors

Model size: 0.1B params
Tensor type: F32