open-sci-ref-v0.02-1.7b-dclm-1T-4096-rope_theta-100k

1.7B open-sci-ref model trained on DCLM for 1T tokens (sequence length 4096, RoPE theta = 100000).

The main branch holds the final checkpoint (iter 238419). Intermediate checkpoints (iters 58000-238000, every 2000) are available as branches named iter_XXXXXXX.
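The branch layout described above can be enumerated programmatically. The following is a minimal sketch, not part of the original release: the helper names `checkpoint_branches` and `load_checkpoint` are hypothetical, the zero-padded width of 7 digits is an assumption read off the iter_XXXXXXX pattern, and the full repository id (including the organization prefix) must be supplied by the caller. Loading relies on the standard `revision` argument of `from_pretrained` in transformers; depending on how the model is packaged, additional arguments such as `trust_remote_code=True` may be needed.

```python
def checkpoint_branches(first=58_000, last=238_000, step=2_000, width=7):
    """Return branch names for the intermediate checkpoints.

    Assumption: iteration numbers are zero-padded to `width` digits,
    as suggested by the iter_XXXXXXX pattern. Verify against the
    repository's actual branch list before relying on this.
    """
    return [f"iter_{it:0{width}d}" for it in range(first, last + 1, step)]


def load_checkpoint(repo_id, revision="main"):
    """Load tokenizer and model from one branch of the repository.

    `repo_id` is the full hub id of this model (not given here);
    `revision` is "main" for the final checkpoint or a branch name
    produced by checkpoint_branches().
    """
    # Imported lazily so checkpoint_branches() works without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)
    return tokenizer, model


branches = checkpoint_branches()
print(len(branches), branches[0], branches[-1])
```

Under these assumptions the range covers 91 intermediate checkpoints, and any of the generated names can be passed as `revision` to `load_checkpoint`.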

Evaluation

Results for the final checkpoint on the open-sci-0.01 suite, run with lm-eval-harness. Metrics were collected with oellm collect-results.

| Task | n-shot | Metric | Score |
|---|---|---|---|
| arc_challenge | 10 | acc_norm | 0.4761 |
| arc_easy | 10 | acc_norm | 0.7689 |
| boolq | 10 | acc | 0.7664 |
| commonsense_qa | 10 | acc | 0.4676 |
| copa | 0 | acc | 0.8200 |
| hellaswag | 10 | acc_norm | 0.7367 |
| lambada_openai | 0 | acc | 0.7019 |
| mmlu | 5 | acc | 0.4002 |
| openbookqa | 0 | acc_norm | 0.4080 |
| piqa | 10 | acc_norm | 0.7845 |
| social_iqa | 0 | acc | 0.4463 |
| winogrande | 0 | acc | 0.6606 |
| average | | | 0.6198 |
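The average row appears to be the unweighted mean of the twelve task scores, mixing acc and acc_norm as listed. A small sketch that reproduces it from the table values (the `scores` dictionary is just a transcription of the table, not an output of any tool):

```python
# Scores transcribed from the evaluation table above.
scores = {
    "arc_challenge": 0.4761,
    "arc_easy": 0.7689,
    "boolq": 0.7664,
    "commonsense_qa": 0.4676,
    "copa": 0.8200,
    "hellaswag": 0.7367,
    "lambada_openai": 0.7019,
    "mmlu": 0.4002,
    "openbookqa": 0.4080,
    "piqa": 0.7845,
    "social_iqa": 0.4463,
    "winogrande": 0.6606,
}

# Unweighted mean over tasks; each task counts once regardless of its size.
average = sum(scores.values()) / len(scores)
print(f"{average:.4f}")  # prints 0.6198
```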