wandb run: https://wandb.ai/eleutherai/pythia-rlhf/runs/e0drjcsz?workspace=user-yongzx

Model Evals:

| Task         |Version|Filter|  Metric  | Value |   |Stderr|
|--------------|-------|------|----------|------:|---|-----:|
|arc_challenge |Yaml   |none  |acc       | 0.1877|±  |0.0114|
|              |       |none  |acc_norm  | 0.2372|±  |0.0124|
|arc_easy      |Yaml   |none  |acc       | 0.4390|±  |0.0102|
|              |       |none  |acc_norm  | 0.4082|±  |0.0101|
|logiqa        |Yaml   |none  |acc       | 0.1889|±  |0.0154|
|              |       |none  |acc_norm  | 0.2473|±  |0.0169|
|piqa          |Yaml   |none  |acc       | 0.6213|±  |0.0113|
|              |       |none  |acc_norm  | 0.6279|±  |0.0113|
|sciq          |Yaml   |none  |acc       | 0.7230|±  |0.0142|
|              |       |none  |acc_norm  | 0.6840|±  |0.0147|
|winogrande    |Yaml   |none  |acc       | 0.5162|±  |0.0140|
|lambada_openai|Yaml   |none  |perplexity|58.9478|±  |2.7662|
|              |       |none  |acc       | 0.2602|±  |0.0061|