pythia-2.8b-sft / README.md

wandb run: https://wandb.ai/eleutherai/pythia-rlhf/runs/0c0pmvz8

| Task | Version | Filter | Metric | Value | Stderr |
|---|---|---|---|---:|---:|
| arc_challenge | Yaml | none | acc | 0.2961 | ± 0.0133 |
| | | none | acc_norm | 0.3285 | ± 0.0137 |
| arc_easy | Yaml | none | acc | 0.6452 | ± 0.0098 |
| | | none | acc_norm | 0.5678 | ± 0.0102 |
| logiqa | Yaml | none | acc | 0.2151 | ± 0.0161 |
| | | none | acc_norm | 0.2857 | ± 0.0177 |
| piqa | Yaml | none | acc | 0.7508 | ± 0.0101 |
| | | none | acc_norm | 0.7503 | ± 0.0101 |
| sciq | Yaml | none | acc | 0.8820 | ± 0.0102 |
| | | none | acc_norm | 0.8140 | ± 0.0123 |
| winogrande | Yaml | none | acc | 0.6038 | ± 0.0137 |
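
To read the `Stderr` column as an uncertainty range, the following is a minimal sketch that turns each reported value into an approximate 95% confidence interval. It assumes a normal approximation (value ± 1.96 × stderr), which is not stated in this README. The `results` dictionary and the `confidence_interval` helper are hypothetical names introduced for illustration; the numbers are copied from the table above.

```python
# Values and standard errors copied from the evaluation table above.
# Keys are (task, metric); values are (value, stderr).
results = {
    ("arc_challenge", "acc"): (0.2961, 0.0133),
    ("arc_challenge", "acc_norm"): (0.3285, 0.0137),
    ("arc_easy", "acc"): (0.6452, 0.0098),
    ("arc_easy", "acc_norm"): (0.5678, 0.0102),
    ("logiqa", "acc"): (0.2151, 0.0161),
    ("logiqa", "acc_norm"): (0.2857, 0.0177),
    ("piqa", "acc"): (0.7508, 0.0101),
    ("piqa", "acc_norm"): (0.7503, 0.0101),
    ("sciq", "acc"): (0.8820, 0.0102),
    ("sciq", "acc_norm"): (0.8140, 0.0123),
    ("winogrande", "acc"): (0.6038, 0.0137),
}

# Assumption: normal approximation, so a 95% interval spans 1.96 standard errors.
Z_95 = 1.96


def confidence_interval(value, stderr, z=Z_95):
    """Return the (lower, upper) bounds of value +/- z * stderr."""
    margin = z * stderr
    return value - margin, value + margin


if __name__ == "__main__":
    for (task, metric), (value, stderr) in results.items():
        lower, upper = confidence_interval(value, stderr)
        print(f"{task:<14} {metric:<9} {value:.4f}  [{lower:.4f}, {upper:.4f}]")
```

Under this assumption, two scores whose intervals overlap heavily (for example piqa `acc` and `acc_norm`) should not be treated as meaningfully different.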