prabhasa-b_ss-0.1 — Prabhāsa (Pāṇinian Structured pretraining for Small LMs)

Submission to the BabyLM 2026 Strict-Small track (10M words). The model is an ELC-PSALM encoder (RoPE, Vidyut N-hot morpheme embeddings, kāraka-aware masking, Muon optimizer) trained with pure/hybrid MLM. Load it with AutoModelForMaskedLM and trust_remote_code=True.
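
As a rough illustration of the "N-hot morpheme embeddings" named above, the sketch below assumes that N-hot means a token switches on several morpheme features at once and that their embedding rows are summed. The card does not say how the model actually combines them, and the table, ids, and function name are hypothetical.

```python
def n_hot_embedding(active_ids, table):
    """Sum the embedding rows of every active morpheme id (multi-hot lookup)."""
    dim = len(table[0])
    out = [0.0] * dim
    for idx in active_ids:
        for d in range(dim):
            out[d] += table[idx][d]
    return out

# Hypothetical 3-morpheme table with 2-dimensional embeddings (illustration only).
MORPHEME_TABLE = [
    [1.0, 0.0],  # id 0: a root
    [0.0, 1.0],  # id 1: a number suffix
    [0.5, 0.5],  # id 2: a case marker
]

# A token analysed as root + case marker activates ids 0 and 2.
print(n_hot_embedding([0, 2], MORPHEME_TABLE))  # [1.5, 0.5]
```

Summing is only one possible choice. Averaging or a learned projection over the N-hot vector would fit the same description.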

Results

  • BLiMP-PLL: 64.09 (3-seed mean ±0.26). GLUE avg 58.07; Text-Avg 49.86.
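
The BLiMP-PLL number is presumably obtained by scoring BLiMP minimal pairs with pseudo-log-likelihood (PLL), the usual way to evaluate a masked LM on that benchmark. The card does not describe the evaluation pipeline, so the sketch below is an assumption: `masked_log_prob` is a hypothetical callable standing in for the model, and the toy unigram table is not the real model.

```python
import math

def pseudo_log_likelihood(tokens, masked_log_prob):
    """Sum log P(token_i | sentence with position i masked) over all positions."""
    return sum(masked_log_prob(tokens, i) for i in range(len(tokens)))

def minimal_pair_accuracy(pairs, masked_log_prob):
    """Percentage of (grammatical, ungrammatical) pairs where the grammatical one scores higher."""
    correct = sum(
        pseudo_log_likelihood(good, masked_log_prob)
        > pseudo_log_likelihood(bad, masked_log_prob)
        for good, bad in pairs
    )
    return 100.0 * correct / len(pairs)

# Toy stand-in for a masked LM: a fixed unigram table that ignores context.
TOY_PROBS = {"the": 0.4, "dogs": 0.2, "bark": 0.3, "barks": 0.1}

def toy_masked_log_prob(tokens, i):
    return math.log(TOY_PROBS[tokens[i]])

pairs = [(["the", "dogs", "bark"], ["the", "dogs", "barks"])]
print(minimal_pair_accuracy(pairs, toy_masked_log_prob))  # 100.0
```

With a real masked LM, `masked_log_prob` would replace position `i` with the mask token, run a forward pass, and read the log-softmax entry of the original token.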

Honest findings (pre-registered, controlled)

  • F1: the objective effect is scale-dependent (pure-MLM wins at 100M, neutral at 10M).
  • F2: kāraka masking is causally null at matched budget (ΔK−C +0.10, ns).
  • F3: kāraka auxiliary objective gives no significant BLiMP lift (5-seed Δ +0.76, ns).
  • Robust wins = architecture (RoPE) + objective (pure-MLM). The Pāṇinian mechanisms contribute interpretability and design, not a measured BLiMP gain.

Code: github.com/SharathSPhD/prabhasa-babylm
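
The "ns" verdicts in F2 and F3 above compare a treatment against a control across seeds. The card does not say which test was used, so the following is only a sketch of one common choice, a paired t-test over per-seed scores. The scores are placeholders, not results from this model.

```python
import math
import statistics

def paired_t_statistic(treatment, control):
    """Return (mean difference, t statistic) for per-seed paired scores."""
    diffs = [t - c for t, c in zip(treatment, control)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference, using the sample standard deviation.
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err

# Placeholder per-seed scores for 5 seeds (illustration only).
treatment_scores = [64.0, 65.0, 63.0, 66.0, 64.0]
control_scores = [63.0, 65.0, 64.0, 64.0, 63.0]

mean_diff, t_stat = paired_t_statistic(treatment_scores, control_scores)

# Two-sided critical value of Student's t at alpha = 0.05 with 4 degrees of freedom.
T_CRIT_DF4 = 2.776
significant = abs(t_stat) > T_CRIT_DF4
print(f"delta = {mean_diff:+.2f}, t = {t_stat:.2f}, significant = {significant}")
```

In this toy case the mean difference is positive but the t statistic stays below the critical value, which is the situation an "ns" label describes: a small lift that seed-to-seed variance can explain.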

Usage

# The architecture is custom, so the model must be loaded with trust_remote_code=True.
from transformers import AutoModelForMaskedLM, AutoTokenizer
model = AutoModelForMaskedLM.from_pretrained("qbz506/prabhasa-b_ss-0.1", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("qbz506/prabhasa-b_ss-0.1")

Model size: 0.1B params (Safetensors, tensor type F32)