|     Tasks      |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|----------------|------:|------|-----:|--------|---|-----:|---|------|
|kobest_boolq    |      1|none  |     5|acc     |↑  |0.5726|±  |0.0132|
|                |       |none  |     5|f1      |↑  |0.5725|±  |   N/A|
|kobest_copa     |      1|none  |     5|acc     |↑  |0.5200|±  |0.0158|
|                |       |none  |     5|f1      |↑  |0.5189|±  |   N/A|
|kobest_hellaswag|      1|none  |     5|acc     |↑  |0.3640|±  |0.0215|
|                |       |none  |     5|acc_norm|↑  |0.4380|±  |0.0222|
|                |       |none  |     5|f1      |↑  |0.3592|±  |   N/A|
|kobest_sentineg |      1|none  |     5|acc     |↑  |0.5642|±  |0.0249|
|                |       |none  |     5|f1      |↑  |0.5554|±  |   N/A|
|kobest_wic      |      1|none  |     5|acc     |↑  |0.5087|±  |0.0141|
|                |       |none  |     5|f1      |↑  |0.4979|±  |   N/A|

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.2995|±  |0.0126|
|     |       |strict-match    |     5|exact_match|↑  |0.2987|±  |0.0126|
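
The `Stderr` column can be turned into rough confidence intervals for reading these scores. The sketch below is illustrative only: it copies the `acc` and `exact_match` rows from the tables above and assumes a normal approximation (z = 1.96 for a 95% interval). The helper names `confidence_interval` and `includes` are hypothetical and are not part of the evaluation harness.

```python
# Rough 95% confidence intervals for the accuracy-style rows in the tables above.
# Assumption: a normal approximation (z = 1.96) is adequate for these scores.
Z_95 = 1.96

# (value, stderr) pairs copied from the result tables.
results = {
    "kobest_boolq": (0.5726, 0.0132),
    "kobest_copa": (0.5200, 0.0158),
    "kobest_hellaswag": (0.3640, 0.0215),
    "kobest_sentineg": (0.5642, 0.0249),
    "kobest_wic": (0.5087, 0.0141),
    "gsm8k (flexible-extract)": (0.2995, 0.0126),
    "gsm8k (strict-match)": (0.2987, 0.0126),
}


def confidence_interval(value, stderr, z=Z_95):
    """Return the (low, high) bounds of value +/- z * stderr."""
    return value - z * stderr, value + z * stderr


def includes(value, stderr, baseline, z=Z_95):
    """Return True if the baseline score lies inside the interval."""
    low, high = confidence_interval(value, stderr, z)
    return low <= baseline <= high


for task, (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task:<26} {value:.4f}  [{low:.4f}, {high:.4f}]")
```

For example, `includes(0.5200, 0.0158, 0.5)` checks whether the `kobest_copa` accuracy is distinguishable from a 0.5 baseline. Treating 0.5 as the chance level is itself an assumption about a balanced two-choice task. The `f1` rows report no standard error (`N/A`), so they are left out.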