# pythia-1b-sft-hh

Training run (Weights & Biases): https://wandb.ai/eleutherai/pythia-rlhf/runs/6y83ekqy?workspace=user-yongzx

## Model Evals

| Task           | Version | Filter | Metric     | Value  | Stderr   |
|----------------|---------|--------|------------|--------|----------|
| arc_challenge  | Yaml    | none   | acc        | 0.2526 | ± 0.0127 |
|                |         | none   | acc_norm   | 0.2773 | ± 0.0131 |
| arc_easy       | Yaml    | none   | acc        | 0.5791 | ± 0.0101 |
|                |         | none   | acc_norm   | 0.4912 | ± 0.0103 |
| lambada_openai | Yaml    | none   | perplexity | 7.0516 | ± 0.1979 |
|                |         | none   | acc        | 0.5684 | ± 0.0069 |
| logiqa         | Yaml    | none   | acc        | 0.2166 | ± 0.0162 |
|                |         | none   | acc_norm   | 0.2919 | ± 0.0178 |
| piqa           | Yaml    | none   | acc        | 0.7176 | ± 0.0105 |
|                |         | none   | acc_norm   | 0.6964 | ± 0.0107 |
| sciq           | Yaml    | none   | acc        | 0.8460 | ± 0.0114 |
|                |         | none   | acc_norm   | 0.7700 | ± 0.0133 |
| winogrande     | Yaml    | none   | acc        | 0.5399 | ± 0.0140 |
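
The column layout above (Task / Version / Filter / Metric / Value / Stderr) matches the output of EleutherAI's lm-evaluation-harness, so results of this shape can typically be regenerated with its CLI. The sketch below is a best-effort reconstruction, not the exact command used for this card: the checkpoint path `path/to/pythia-1b-sft-hh` is a placeholder, and batch size and other run settings are assumptions.

```shell
# Sketch: reproduce a harness-style eval table for this model.
# Assumes lm-evaluation-harness is installed: pip install lm-eval
# The pretrained path is a placeholder -- point it at the actual checkpoint.
lm_eval --model hf \
  --model_args pretrained=path/to/pythia-1b-sft-hh \
  --tasks arc_challenge,arc_easy,lambada_openai,logiqa,piqa,sciq,winogrande \
  --batch_size 8
```

Exact numbers may differ slightly across harness versions, since task definitions and prompt formats evolve between releases.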