Commit d141038 by psinger · 1 Parent(s): 15a96d1

Update README.md

Files changed (1)
  1. README.md +15 -0
README.md CHANGED
@@ -164,6 +164,21 @@ Model validation results using [EleutherAI lm-evaluation-harness](https://github
164
  CUDA_VISIBLE_DEVICES=0 python main.py --model hf-causal-experimental --model_args pretrained=h2oai/h2ogpt-gm-oasst1-en-1024-12b --tasks openbookqa,arc_easy,winogrande,hellaswag,arc_challenge,piqa,boolq --device cuda &> eval.log
165
  ```
166
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
167
 
168
  ## Disclaimer
169
 
 
164
  CUDA_VISIBLE_DEVICES=0 python main.py --model hf-causal-experimental --model_args pretrained=h2oai/h2ogpt-gm-oasst1-en-1024-12b --tasks openbookqa,arc_easy,winogrande,hellaswag,arc_challenge,piqa,boolq --device cuda &> eval.log
165
  ```
166
 
167
+ | Task |Version| Metric |Value | |Stderr|
168
+ |-------------|------:|--------|-----:|---|-----:|
169
+ |openbookqa | 0|acc |0.3080|± |0.0207|
170
+ | | |acc_norm|0.3980|± |0.0219|
171
+ |boolq | 1|acc |0.5098|± |0.0087|
172
+ |winogrande | 0|acc |0.6622|± |0.0133|
173
+ |arc_easy | 0|acc |0.6435|± |0.0098|
174
+ | | |acc_norm|0.5800|± |0.0101|
175
+ |piqa | 0|acc |0.7704|± |0.0098|
176
+ | | |acc_norm|0.7704|± |0.0098|
177
+ |hellaswag | 0|acc |0.5150|± |0.0050|
178
+ | | |acc_norm|0.6951|± |0.0046|
179
+ |arc_challenge| 0|acc |0.3345|± |0.0138|
180
+ | | |acc_norm|0.3754|± |0.0142|
181
+
182
 
183
  ## Disclaimer
184