prabhasa-b_s — Prabhāsa (Pāṇinian Structured pretraining for Small LMs)

BabyLM 2026 Strict (100M words) submission. The model is an ELC-PSALM encoder (RoPE, Vidyut N-hot morpheme embeddings, kāraka-aware masking, Muon optimizer) trained with pure/hybrid MLM. Load it with AutoModelForMaskedLM and trust_remote_code=True.
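For readers unfamiliar with RoPE, the following is a minimal pure-Python sketch of rotary position embeddings. It is not the ELC-PSALM implementation: the base of 10000, the pairing of consecutive dimensions, and the names rope and dot are assumptions made for illustration. It shows the property that makes RoPE attractive, namely that the query-key dot product depends only on the relative offset between positions.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each consecutive pair (vec[2i], vec[2i+1]) by pos * base**(-2i/d).

    Assumption: consecutive-pair layout and base 10000; the actual model
    may use a different layout or base.
    """
    d = len(vec)
    assert d % 2 == 0, "RoPE needs an even dimension"
    out = []
    for i in range(d // 2):
        theta = pos * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[2 * i], vec[2 * i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query and key vectors.
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.2]

# Both pairs of positions have the same offset (3), so the scores should match.
score_a = dot(rope(q, 5), rope(k, 2))
score_b = dot(rope(q, 13), rope(k, 10))
```

Here score_a and score_b agree up to floating-point error, because rotating both vectors by a common extra angle leaves their dot product unchanged.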

Results

  • BLiMP-PLL: 73.06 (single seed). Text-Avg: 55.99 (above the ~54 baseline). BLiMP-supplement: 67.46 (+2.46). Entity-tracking: 33.26 (+9.68).
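As background for the BLiMP-PLL figure, here is a minimal sketch of pseudo-log-likelihood (PLL) scoring on minimal pairs. The source does not describe the exact evaluation code, so this is only the generic technique: each token is masked in turn, the log-probabilities of the true tokens are summed, and a pair counts as correct when the grammatical sentence scores higher. The names MASK, pseudo_log_likelihood, minimal_pair_accuracy, and toy_log_prob are hypothetical, and the toy scorer stands in for a real MLM head.

```python
import math

MASK = "[MASK]"  # placeholder mask symbol (assumption)

def pseudo_log_likelihood(tokens, log_prob_fn):
    """Sum log P(token_i | sentence with position i masked) over all positions."""
    total = 0.0
    for i, tok in enumerate(tokens):
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        total += log_prob_fn(masked, i, tok)
    return total

def minimal_pair_accuracy(pairs, log_prob_fn):
    """Percentage of (good, bad) pairs where the good sentence has higher PLL."""
    hits = sum(
        pseudo_log_likelihood(good, log_prob_fn)
        > pseudo_log_likelihood(bad, log_prob_fn)
        for good, bad in pairs
    )
    return 100.0 * hits / len(pairs)

def toy_log_prob(masked, i, tok):
    """Toy stand-in for an MLM: rewards subject-verb agreement with the previous word."""
    prev = masked[i - 1] if i > 0 else None
    agree = {("cats", "are"), ("cat", "is")}
    clash = {("cats", "is"), ("cat", "are")}
    if (prev, tok) in agree:
        return math.log(0.9)
    if (prev, tok) in clash:
        return math.log(0.1)
    return math.log(0.5)

# Two hand-written minimal pairs (illustrative, not BLiMP data).
pairs = [
    ("the cats are here".split(), "the cats is here".split()),
    ("the cat is here".split(), "the cat are here".split()),
]
accuracy = minimal_pair_accuracy(pairs, toy_log_prob)
```

With a real model, log_prob_fn would run one forward pass per masked position and read the log-softmax of the true token at that position.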

Honest findings (pre-registered, controlled)

  • F1: the objective effect is scale-dependent (pure-MLM wins at 100M, neutral at 10M).
  • F2: kāraka masking is causally null at matched budget (ΔK−C = +0.10, not significant).
  • F3: the kāraka auxiliary objective gives no significant BLiMP lift (5-seed Δ = +0.76, not significant).
  • Robust wins = architecture (RoPE) + objective (pure-MLM). The Pāṇinian mechanisms contribute interpretability and design, not a measured BLiMP gain.

Code: github.com/SharathSPhD/prabhasa-babylm
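To make the "5-seed Δ, not significant" wording in F3 concrete, the sketch below shows one common way to test a multi-seed difference: a paired t-statistic over per-seed deltas. The source does not say which test was actually used, so this is an assumption, and the scores in the example are placeholder numbers invented for illustration, not results from this submission. The name paired_t is hypothetical.

```python
import math
import statistics

def paired_t(scores_a, scores_b):
    """Return (mean delta, t statistic) for per-seed paired scores."""
    deltas = [a - b for a, b in zip(scores_a, scores_b)]
    mean = statistics.mean(deltas)
    # Standard error of the mean delta, using the sample standard deviation.
    se = statistics.stdev(deltas) / math.sqrt(len(deltas))
    return mean, mean / se

# Placeholder numbers for illustration only; NOT results from the submission.
with_aux = [71.0, 73.5, 72.0, 74.5, 71.5]
without_aux = [71.5, 72.0, 72.5, 72.0, 72.5]

# Two-sided 5% critical value of Student's t with 4 degrees of freedom (5 seeds).
T_CRIT_DF4 = 2.776

mean_delta, t_stat = paired_t(with_aux, without_aux)
significant = abs(t_stat) > T_CRIT_DF4
```

With only five seeds the critical value is large, so a positive mean delta can easily fail to reach significance when seed-to-seed variance is comparable to the effect.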
Usage

# trust_remote_code=True lets transformers load the model's custom code.
from transformers import AutoModelForMaskedLM, AutoTokenizer
model = AutoModelForMaskedLM.from_pretrained("qbz506/prabhasa-b_s", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("qbz506/prabhasa-b_s")

Model size: 0.1B params (Safetensors, tensor type F32)