prabhasa-b_ss-0.1 — Prabhāsa (Pāṇinian Structured pretraining for Small LMs)

Submission to the BabyLM 2026 Strict-Small track (10M words). The model is an ELC-PSALM encoder with RoPE, Vidyut N-hot morpheme embeddings, and kāraka-aware masking, trained with the Muon optimizer on a pure/hybrid MLM objective. Load it through AutoModelForMaskedLM with trust_remote_code=True.

Results

BLiMP-PLL: 64.09 (3-seed mean ±0.26). GLUE avg 58.07; Text-Avg 49.86.
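
For readers unfamiliar with the metric, the following is a minimal sketch of pseudo-log-likelihood (PLL) scoring as it is commonly applied to BLiMP minimal pairs with a masked LM: each token is masked in turn, the log-probability of the true token is summed, and a pair counts as correct when the grammatical sentence scores higher. The card does not describe the exact evaluation pipeline, so the function names and the `log_prob_fn` interface are hypothetical, and details such as special-token handling are left out.

```python
import math
from typing import Callable, Sequence

# Hypothetical interface (not from the model's code): given token ids with one
# position masked, return log-probabilities over the vocabulary at that position.
LogProbFn = Callable[[Sequence[int], int], Sequence[float]]


def pseudo_log_likelihood(
    token_ids: Sequence[int], log_prob_fn: LogProbFn, mask_id: int
) -> float:
    """Sum of log P(token_i | all other tokens), masking one position at a time."""
    total = 0.0
    for pos, true_id in enumerate(token_ids):
        masked = list(token_ids)
        masked[pos] = mask_id  # hide the token being scored
        total += log_prob_fn(masked, pos)[true_id]
    return total


def pair_is_correct(
    good_ids: Sequence[int],
    bad_ids: Sequence[int],
    log_prob_fn: LogProbFn,
    mask_id: int,
) -> bool:
    """A minimal pair is correct if the grammatical sentence has the higher PLL."""
    good = pseudo_log_likelihood(good_ids, log_prob_fn, mask_id)
    bad = pseudo_log_likelihood(bad_ids, log_prob_fn, mask_id)
    return good > bad


def uniform_log_probs(
    masked_ids: Sequence[int], pos: int, vocab_size: int = 4
) -> Sequence[float]:
    """Toy stand-in for a model: every token is equally likely."""
    return [math.log(1.0 / vocab_size)] * vocab_size


if __name__ == "__main__":
    # Toy ids only; a real run would tokenize BLiMP sentences and call the model.
    print(pseudo_log_likelihood([1, 2, 3], uniform_log_probs, mask_id=0))
```

In a real evaluation, `log_prob_fn` would wrap a forward pass of the masked LM followed by a log-softmax over the logits at the masked position, and accuracy would be the fraction of pairs for which `pair_is_correct` returns true.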

Honest findings (pre-registered, controlled)

F1: The effect of the objective is scale-dependent: pure MLM wins at 100M and is neutral at 10M.
F2: Kāraka masking has no measurable effect at matched budget (ΔK−C +0.10, not significant).
F3: The kāraka auxiliary objective gives no significant BLiMP lift (5-seed Δ +0.76, not significant).
The robust wins come from the architecture (RoPE) and the objective (pure MLM). The Pāṇinian mechanisms contribute interpretability and design, not a measured BLiMP gain.
Code: github.com/SharathSPhD/prabhasa-babylm
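
The card does not say which significance test sits behind the "ns" labels above. As one illustration of how a seed-paired comparison can be tested, here is a minimal sketch of an exact two-sided sign-flip (paired permutation) test on per-seed score differences. The function name is hypothetical and any deltas passed to it in this sketch are placeholders, not results from this model.

```python
from itertools import product
from typing import Sequence


def sign_flip_p_value(deltas: Sequence[float]) -> float:
    """Exact two-sided sign-flip test on the sum of paired differences.

    Under the null hypothesis each per-seed difference is equally likely to be
    positive or negative, so every sign pattern is enumerated and the p-value
    is the share of patterns at least as extreme as the observed one.
    """
    observed = abs(sum(deltas))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(deltas)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, deltas)))
        if stat >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
    return extreme / total


if __name__ == "__main__":
    # Placeholder deltas for illustration only (treatment minus control, per seed).
    placeholder_deltas = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(sign_flip_p_value(placeholder_deltas))
```

With n paired seeds this test enumerates 2^n sign patterns, so its smallest possible two-sided p-value is 2 / 2^n. For five seeds that is 2/32 = 0.0625, which is worth keeping in mind when reading small-seed comparisons.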

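# Load the model and tokenizer from the Hugging Face Hub.
# trust_remote_code=True lets transformers run the custom modeling code shipped with the repo.
from transformers import AutoModelForMaskedLM, AutoTokenizer
model = AutoModelForMaskedLM.from_pretrained("qbz506/prabhasa-b_ss-0.1", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("qbz506/prabhasa-b_ss-0.1")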

Model size: 0.1B params (Safetensors, tensor type F32)