eval
README.md
CHANGED
@@ -451,3 +451,10 @@ litgpt evaluate --tasks 'hellaswag,gsm8k,truthfulqa_mc2,mmlu,winogrande,arc_chal
 ```bash
 litgpt evaluate --tasks 'gsm8k,mathqa' --out_dir 'evaluate-contrain-math/' --batch_size 4 --dtype 'bfloat16' out/contrain/final/
 ```
+
+|Tasks |Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
+|------|------:|----------------|-----:|-----------|---|-----:|---|-----:|
+|gsm8k |      3|flexible-extract|     5|exact_match|↑  |0.0182|±  |0.0037|
+|      |       |strict-match    |     5|exact_match|↑  |0.0000|±  |0.0000|
+|mathqa|      1|none            |     0|acc        |↑  |0.2124|±  |0.0075|
+|      |       |none            |     0|acc_norm   |↑  |0.2137|±  |0.0075|