prabhasa-b_s — Prabhāsa (Pāṇinian Structured pretraining for Small LMs)

Submission to the BabyLM 2026 Strict track (100M words). The model is an ELC-PSALM encoder with RoPE, Vidyut N-hot morpheme embeddings, kāraka-aware masking, and the Muon optimizer, trained with a pure/hybrid MLM objective. Load it with AutoModelForMaskedLM and trust_remote_code=True.

Results

BLiMP-PLL: 73.06 (single seed). Text-Avg: 55.99 (above the ~54 baseline). BLiMP-supplement: 67.46 (+2.46). Entity-tracking: 33.26 (+9.68).
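For readers unfamiliar with pseudo-log-likelihood (PLL) scoring, the following is a minimal sketch of how a masked LM can be scored on BLiMP-style minimal pairs. It is not the evaluation code behind the numbers above. The names `pseudo_log_likelihood`, `minimal_pair_accuracy`, and `toy_log_prob_fn` are hypothetical, and the toy scorer stands in for a real model's masked-position log-probabilities.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: given token ids with exactly one position masked,
# return log-probabilities over the vocabulary for that position.
LogProbFn = Callable[[Sequence[int], int], Sequence[float]]


def pseudo_log_likelihood(ids: Sequence[int], mask_id: int,
                          log_prob_fn: LogProbFn) -> float:
    """Sum of log P(token_i | all other tokens), masking one position at a time."""
    total = 0.0
    for i, true_id in enumerate(ids):
        masked = list(ids)
        masked[i] = mask_id  # hide only position i
        total += log_prob_fn(masked, i)[true_id]
    return total


def minimal_pair_accuracy(pairs: List[Tuple[Sequence[int], Sequence[int]]],
                          mask_id: int, log_prob_fn: LogProbFn) -> float:
    """Percentage of (good, bad) pairs where the good sentence scores higher."""
    correct = sum(
        pseudo_log_likelihood(good, mask_id, log_prob_fn)
        > pseudo_log_likelihood(bad, mask_id, log_prob_fn)
        for good, bad in pairs
    )
    return 100.0 * correct / len(pairs)


def toy_log_prob_fn(masked_ids: Sequence[int], position: int) -> List[float]:
    """Toy stand-in for a masked LM over a 4-id vocabulary (id 3 is the mask).

    If the left neighbour is token 0, token 1 gets probability 0.7 and the
    other three ids get 0.1 each; otherwise the distribution is uniform.
    """
    if position > 0 and masked_ids[position - 1] == 0:
        probs = [0.1, 0.7, 0.1, 0.1]
    else:
        probs = [0.25, 0.25, 0.25, 0.25]
    return [math.log(p) for p in probs]


MASK_ID = 3
pairs = [([0, 1], [0, 2]), ([2, 2], [0, 1])]
accuracy = minimal_pair_accuracy(pairs, MASK_ID, toy_log_prob_fn)
```

With a real model, `log_prob_fn` would run a forward pass on the masked input and return the log-softmax of the logits at the masked position. Whether scores are length-normalised or summed per sub-word token is an evaluation choice that this sketch does not settle.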

Honest findings (pre-registered, controlled)

F1: the objective effect is scale-dependent (pure-MLM wins at 100M, neutral at 10M).
F2: kāraka masking is causally null at matched budget (ΔK−C +0.10, not significant).
F3: the kāraka auxiliary objective gives no significant BLiMP lift (5-seed Δ +0.76, not significant).
The robust wins come from the architecture (RoPE) and the objective (pure-MLM). The Pāṇinian mechanisms contribute interpretability and design, not a measured BLiMP gain.

Code: github.com/SharathSPhD/prabhasa-babylm

from transformers import AutoModelForMaskedLM, AutoTokenizer

# trust_remote_code=True lets transformers run the model code shipped with the repo.
m = AutoModelForMaskedLM.from_pretrained("qbz506/prabhasa-b_s", trust_remote_code=True)
t = AutoTokenizer.from_pretrained("qbz506/prabhasa-b_s")

Model size: 0.1B params (Safetensors). Tensor type: F32.