psinger committed on
Commit be0e9d2 · 1 Parent(s): f4ff8cd

Update README.md

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -164,6 +164,20 @@ Model validation results using [EleutherAI lm-evaluation-harness](https://github
  CUDA_VISIBLE_DEVICES=0 python main.py --model hf-causal-experimental --model_args pretrained=h2oai/h2ogpt-gm-oasst1-multilang-1024-20b --tasks openbookqa,arc_easy,winogrande,hellaswag,arc_challenge,piqa,boolq --device cuda &> eval.log
  ```
 
+ |    Task     |Version| Metric |Value |   |Stderr|
+ |-------------|------:|--------|-----:|---|-----:|
+ |arc_challenge|      0|acc     |0.3447|±  |0.0139|
+ |             |       |acc_norm|0.3823|±  |0.0142|
+ |arc_easy     |      0|acc     |0.6423|±  |0.0098|
+ |             |       |acc_norm|0.5913|±  |0.0101|
+ |boolq        |      1|acc     |0.6517|±  |0.0083|
+ |hellaswag    |      0|acc     |0.5374|±  |0.0050|
+ |             |       |acc_norm|0.7185|±  |0.0045|
+ |openbookqa   |      0|acc     |0.2920|±  |0.0204|
+ |             |       |acc_norm|0.4100|±  |0.0220|
+ |piqa         |      0|acc     |0.7655|±  |0.0099|
+ |             |       |acc_norm|0.7753|±  |0.0097|
+ |winogrande   |      0|acc     |0.6677|±  |0.0132|
 
  ## Disclaimer