Benchmark Scores

| Tasks / Groups | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| arc_challenge | 1 | none | 0 | acc | 0.5247 | ± 0.0146 |
| arc_challenge | 1 | none | 0 | acc_norm | 0.5623 | ± 0.0145 |
| hellaswag | 1 | none | 0 | acc | 0.6270 | ± 0.0048 |
| hellaswag | 1 | none | 0 | acc_norm | 0.8228 | ± 0.0038 |
| mmlu | N/A | none | 0 | acc | 0.6243 | ± 0.1341 |
| - humanities | N/A | none | 0 | acc | 0.5717 | ± 0.1400 |
| - other | N/A | none | 0 | acc | 0.7016 | ± 0.1143 |
| - social_sciences | N/A | none | 0 | acc | 0.7342 | ± 0.0753 |
| - stem | N/A | none | 0 | acc | 0.5192 | ± 0.1257 |
| winogrande | 1 | none | 0 | acc | 0.7774 | ± 0.0117 |
| gsm8k | 2 | get-answer | 5 | exact_match | 0.6732 | ± 0.0129 |
| truthfulqa_mc2 | 2 | none | 0 | acc | 0.4795 | ± 0.0148 |
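
As a rough guide to reading the Stderr column, the sketch below turns a value and its standard error into an approximate 95% interval. It assumes a normal approximation (z = 1.96), which the card itself does not state; `normal_interval` is a hypothetical helper name used only for illustration.

```python
def normal_interval(value, stderr, z=1.96):
    """Approximate confidence interval: value ± z * stderr (normal approximation, assumed)."""
    half_width = z * stderr
    return value - half_width, value + half_width

# arc_challenge acc_norm, copied from the scores above
low, high = normal_interval(0.5623, 0.0145)
print(f"arc_challenge acc_norm: {low:.4f} to {high:.4f}")  # prints 0.5339 to 0.5907
```

The mmlu group rows report much larger standard errors than the single-task rows, so their intervals come out correspondingly wide under this approximation.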

Average (in percent): 65.658
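
The average is consistent with an unweighted mean of six headline scores, expressed in percent. The sketch below reproduces it; the choice of metrics (acc_norm for arc_challenge and hellaswag, acc for mmlu, winogrande and truthfulqa_mc2, exact_match for gsm8k) is an assumption inferred from the numbers, not something the card states, and `average_percent` is a hypothetical helper name.

```python
# Headline score per benchmark, copied from the scores above (fractions in [0, 1]).
# Which metric counts as "headline" is an assumption inferred from the reported average.
headline_scores = {
    "arc_challenge (acc_norm)": 0.5623,
    "hellaswag (acc_norm)": 0.8228,
    "mmlu (acc)": 0.6243,
    "winogrande (acc)": 0.7774,
    "gsm8k (exact_match)": 0.6732,
    "truthfulqa_mc2 (acc)": 0.4795,
}

def average_percent(scores):
    """Unweighted mean of the scores, expressed in percent."""
    return 100.0 * sum(scores.values()) / len(scores)

average = average_percent(headline_scores)
print(f"Average: {average:.3f}")  # prints "Average: 65.658"
```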

Safetensors
Model size: 7.24B params
Tensor type: BF16

Dataset used to train alnrg2arg/test3_sft_4bit