PSALM ELC-PSALM-S: arm C (Dyck (k-shuffle) dose)

Small bidirectional ELC-BERT-style encoder trained from scratch under the BabyLM Strict-Small protocol. This is ablation arm C: the stage-one structural dose is Dyck (k-shuffle), trimmed to the same token budget as every other arm over a shared English base, so differences between arms are attributable to dose content under a fixed budget rather than to data volume.
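For readers unfamiliar with the dose, here is a minimal sketch of what a k-shuffle Dyck string looks like, assuming the usual definition: the shuffle (interleaving) of k independent Dyck-1 languages, each with its own bracket pair. The sampler, its parameters (`p_open`, `max_len`), and the token spelling (`(0`, `)0`, ...) are hypothetical illustrations and are not taken from the PSALM data pipeline.

```python
import random


def sample_shuffle_dyck(k, max_len, rng, p_open=0.5):
    """Sample one well-formed string from the shuffle of k Dyck-1 languages.

    Hypothetical sampler for illustration only. Each bracket type keeps its
    own counter, so brackets of different types may cross, e.g. (0 (1 )0 )1.
    """
    depth = [0] * k  # number of currently open brackets per type
    tokens = []
    while len(tokens) < max_len:
        pending = sum(depth)
        room = max_len - len(tokens)
        # Opening is only safe if the new bracket and all pending ones can still close.
        can_open = room >= pending + 2
        if pending == 0 and not can_open:
            break
        if can_open and (pending == 0 or rng.random() < p_open):
            i = rng.randrange(k)
            depth[i] += 1
            tokens.append(f"({i}")
        else:
            # Close any type that is currently open; no nesting order is enforced.
            i = rng.choice([j for j in range(k) if depth[j] > 0])
            depth[i] -= 1
            tokens.append(f"){i}")
    return tokens


def is_shuffle_dyck(tokens, k):
    """Check that every bracket type is balanced on its own."""
    depth = [0] * k
    for tok in tokens:
        i = int(tok[1:])
        if tok[0] == "(":
            depth[i] += 1
        else:
            depth[i] -= 1
            if depth[i] < 0:
                return False
    return all(d == 0 for d in depth)


if __name__ == "__main__":
    print(" ".join(sample_shuffle_dyck(k=3, max_len=12, rng=random.Random(0))))
```

Unlike nested Dyck-k, the check above accepts crossing pairs such as `(0 (1 )0 )1`, because each bracket type is tracked by a separate counter rather than a shared stack.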

Trained jointly with masked and causal objectives; minimal pairs are scored by Salazar-style pseudo-log-likelihood. The export registers both AutoModel (base encoder, returns last_hidden_state) and AutoModelForMaskedLM, so the official BabyLM (Super)GLUE fine-tuner can load it directly.

from transformers import AutoModelForMaskedLM, AutoTokenizer
tok = AutoTokenizer.from_pretrained("qbz506/psalm-arm-c", trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained("qbz506/psalm-arm-c", trust_remote_code=True)
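The pseudo-log-likelihood scoring mentioned above can be sketched as follows: mask one token at a time, read off the log-probability of the original token at that position, and sum. This is a minimal sketch of the general technique, not the project's evaluation code. The helper names are hypothetical, and it assumes that the tokenizer adds exactly one special token at each end and that the remote-code model returns an output with a `.logits` field, as standard masked-LM heads do.

```python
def pseudo_log_likelihood(token_ids, mask_id, log_prob_fn, n_special=1):
    """Sum of log P(token_i | all other tokens), masking one position at a time.

    log_prob_fn(masked_ids, position, target_id) must return a float
    log-probability. n_special is the number of special tokens skipped at
    each end of the sequence (assumption: 1, e.g. a start and an end token).
    """
    total = 0.0
    for pos in range(n_special, len(token_ids) - n_special):
        masked = list(token_ids)
        masked[pos] = mask_id
        total += log_prob_fn(masked, pos, token_ids[pos])
    return total


def make_hf_log_prob_fn(model):
    """Adapter for a Hugging Face masked LM (assumes outputs expose .logits)."""
    import torch  # imported lazily so the scoring loop itself has no dependencies

    model.eval()

    def log_prob_fn(masked_ids, position, target_id):
        with torch.no_grad():
            logits = model(input_ids=torch.tensor([masked_ids])).logits
        return torch.log_softmax(logits[0, position], dim=-1)[target_id].item()

    return log_prob_fn


def prefers_first(ids_a, ids_b, mask_id, log_prob_fn):
    """Minimal-pair decision: True if sentence A scores higher than sentence B."""
    score_a = pseudo_log_likelihood(ids_a, mask_id, log_prob_fn)
    score_b = pseudo_log_likelihood(ids_b, mask_id, log_prob_fn)
    return score_a > score_b
```

With the `tok` and `model` loaded above, a pair could be compared with `prefers_first(tok(good)["input_ids"], tok(bad)["input_ids"], tok.mask_token_id, make_hf_log_prob_fn(model))`, where `good` and `bad` are the two sentences of a minimal pair. This runs one forward pass per token; batching the masked copies would be faster but is omitted for clarity.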

See the project site and repository for the method, the seed-replicated results, and the scope statement. This checkpoint is part of a controlled scientific ablation; for the leaderboard-track model see qbz506/psalm-submission.

Safetensors checkpoint. Model size: 0.1B params. Tensor type: F32.