Update README.md
README.md
CHANGED
@@ -164,6 +164,20 @@ Model validation results using [EleutherAI lm-evaluation-harness](https://github
 CUDA_VISIBLE_DEVICES=0 python main.py --model hf-causal-experimental --model_args pretrained=h2oai/h2ogpt-gm-oasst1-multilang-1024-20b --tasks openbookqa,arc_easy,winogrande,hellaswag,arc_challenge,piqa,boolq --device cuda &> eval.log
 ```
 
+|    Task     |Version| Metric |Value |   |Stderr|
+|-------------|------:|--------|-----:|---|-----:|
+|arc_challenge|      0|acc     |0.3447|±  |0.0139|
+|             |       |acc_norm|0.3823|±  |0.0142|
+|arc_easy     |      0|acc     |0.6423|±  |0.0098|
+|             |       |acc_norm|0.5913|±  |0.0101|
+|boolq        |      1|acc     |0.6517|±  |0.0083|
+|hellaswag    |      0|acc     |0.5374|±  |0.0050|
+|             |       |acc_norm|0.7185|±  |0.0045|
+|openbookqa   |      0|acc     |0.2920|±  |0.0204|
+|             |       |acc_norm|0.4100|±  |0.0220|
+|piqa         |      0|acc     |0.7655|±  |0.0099|
+|             |       |acc_norm|0.7753|±  |0.0097|
+|winogrande   |      0|acc     |0.6677|±  |0.0132|
 
 ## Disclaimer
 
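The table added in this diff reports each metric as a value with a standard error. As a rough illustration of how such a table could be consumed downstream, here is a minimal Python sketch that parses rows in this layout and derives an approximate 95% interval per metric. The function names `parse_results` and `interval95` are hypothetical, and the normal approximation with `z = 1.96` is an assumption of this sketch, not something stated in the README or guaranteed by the harness. Only three rows of the table are embedded to keep the example short.

```python
# Three rows copied from the results table in this diff (sample only).
TABLE = """\
|    Task     |Version| Metric |Value |   |Stderr|
|-------------|------:|--------|-----:|---|-----:|
|arc_challenge|      0|acc     |0.3447|±  |0.0139|
|             |       |acc_norm|0.3823|±  |0.0142|
|boolq        |      1|acc     |0.6517|±  |0.0083|
"""


def parse_results(table):
    """Parse a markdown results table in the layout above into a list of dicts.

    Hypothetical helper: assumes exactly six columns per row and that an
    empty Task cell means "same task as the previous row".
    """
    records = []
    task = None
    for line in table.strip().splitlines()[2:]:  # skip header and separator rows
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        name, _version, metric, value, _pm, stderr = cells
        task = name or task  # carry the task name down to continuation rows
        records.append(
            {
                "task": task,
                "metric": metric,
                "value": float(value),
                "stderr": float(stderr),
            }
        )
    return records


def interval95(value, stderr, z=1.96):
    """Approximate 95% interval, assuming a normal approximation (an assumption)."""
    return (value - z * stderr, value + z * stderr)


if __name__ == "__main__":
    for rec in parse_results(TABLE):
        low, high = interval95(rec["value"], rec["stderr"])
        print(f'{rec["task"]:<14}{rec["metric"]:<9}{low:.4f} .. {high:.4f}')
```

Intervals computed this way are only a quick aid for judging whether two scores plausibly differ; they are not part of the reported results.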