slimfrikha-tii committed on
Commit
3097aa6
1 Parent(s): d570153

docs(readme): update

  1. README.md +4 -10
README.md CHANGED
@@ -128,9 +128,9 @@ We report in the following table our internal pipeline benchmarks:
  <tr>
  <td rowspan="3">Math</td>
  <td>GSM8K (5-shot)</td>
- <td>19.2</td>
- <td>33.7</td>
- <td><b>78.8</b></td>
+ <td>78.1</td>
+ <td>77.5</td>
+ <td><b>79.1</b></td>
  </tr>
  <tr>
  <td>GSM8k (8-shot, COT)</td>
@@ -145,7 +145,7 @@ We report in the following table our internal pipeline benchmarks:
  <td><b>33.1</b></td>
  </tr>
  <tr>
- <td rowspan="6">Reasoning</td>
+ <td rowspan="5">Reasoning</td>
  <td>Arc Challenge (25-shot)</td>
  <td>46.6</td>
  <td>55.7</td>
@@ -175,12 +175,6 @@ We report in the following table our internal pipeline benchmarks:
  <td><b>53.9</b></td>
  <td>52.4</td>
  </tr>
- <tr>
- <td>BBH (3-shot, COT)</td>
- <td>6.7</td>
- <td>21.2</td>
- <td><b>69.3</b></td>
- </tr>
  <tr>
  <td rowspan="4">CommonSense Understanding</td>
  <td>PIQA (0-shot)</td>