|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.6179|±  |0.0134|
|     |       |strict-match    |     5|exact_match|↑  |0.6171|±  |0.0134|

|     Tasks      |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|----------------|------:|------|-----:|--------|---|-----:|---|------|
|kobest_boolq    |      1|none  |     5|acc     |↑  |0.7664|±  |0.0113|
|                |       |none  |     5|f1      |↑  |0.7662|±  |   N/A|
|kobest_copa     |      1|none  |     5|acc     |↑  |0.5620|±  |0.0157|
|                |       |none  |     5|f1      |↑  |0.5612|±  |   N/A|
|kobest_hellaswag|      1|none  |     5|acc     |↑  |0.3840|±  |0.0218|
|                |       |none  |     5|acc_norm|↑  |0.4900|±  |0.0224|
|                |       |none  |     5|f1      |↑  |0.3807|±  |   N/A|
|kobest_sentineg |      1|none  |     5|acc     |↑  |0.5869|±  |0.0247|
|                |       |none  |     5|f1      |↑  |0.5545|±  |   N/A|
|kobest_wic      |      1|none  |     5|acc     |↑  |0.4952|±  |0.0141|
|                |       |none  |     5|f1      |↑  |0.4000|±  |   N/A|
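
One way to read the Value and Stderr columns together is as an approximate confidence interval. The sketch below is only an illustration: it assumes the reported Stderr is a standard error of the mean and that a normal approximation is acceptable, and the helper name `confidence_interval` is hypothetical. The numbers are copied from the `exact_match` and `acc` rows above; rows whose Stderr is N/A are left out.

```python
# (value, stderr) pairs copied from the exact_match / acc rows of the tables above.
results = {
    "gsm8k (flexible-extract)": (0.6179, 0.0134),
    "gsm8k (strict-match)": (0.6171, 0.0134),
    "kobest_boolq": (0.7664, 0.0113),
    "kobest_copa": (0.5620, 0.0157),
    "kobest_hellaswag": (0.3840, 0.0218),
    "kobest_sentineg": (0.5869, 0.0247),
    "kobest_wic": (0.4952, 0.0141),
}

Z_95 = 1.96  # two-sided 95% quantile of the standard normal (assumed approximation)


def confidence_interval(value, stderr, z=Z_95):
    """Return the (low, high) normal-approximation interval around value."""
    return value - z * stderr, value + z * stderr


for task, (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task:<26} {value:.4f}  95% CI [{low:.4f}, {high:.4f}]")
```

Under these assumptions the `kobest_wic` interval is roughly 0.468 to 0.523, so it contains 0.5, while the `kobest_boolq` interval stays well above it.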