Update README.md
README.md
CHANGED
@@ -192,23 +192,23 @@ The training corpus for Nemotron-4-340B-Base consists of English and multilingua

#### Overview

-*5-shot performance.* Language Understanding evaluated using
| Average |
| :------------- |
| 81.1 |

-*Zero-shot performance.* Evaluated using select datasets from the
| HellaSwag | Winogrande | BBH| ARC-Challenge |
| :------------- | :------------- | :------------- | :------------- |
| 90.53 | 89.50 | 85.44 | 94.28 |

-*Chain of Thought (CoT)*. Multilingual capabilities evaluated using

| ES Exact Match (%) | JA Exact Match (%) | TH Exact Match (%) |
| :------------- | :------------- | :------------- |
| 68.8 | 69.6 | 68.4 |

-*Code generation performance*. Evaluated using
| p@1, 0-Shot |
| :------------- |
| 57.3 |

#### Overview

+*5-shot performance.* Language Understanding evaluated using Massive Multitask Language Understanding:
| Average |
| :------------- |
| 81.1 |

+*Zero-shot performance.* Evaluated using select datasets from the LM Evaluation Harness with additions:
| HellaSwag | Winogrande | BBH| ARC-Challenge |
| :------------- | :------------- | :------------- | :------------- |
| 90.53 | 89.50 | 85.44 | 94.28 |

+*Chain of Thought (CoT)*. Multilingual capabilities evaluated using Multilingual Grade School Math:

| ES Exact Match (%) | JA Exact Match (%) | TH Exact Match (%) |
| :------------- | :------------- | :------------- |
| 68.8 | 69.6 | 68.4 |

+*Code generation performance*. Evaluated using HumanEval:
| p@1, 0-Shot |
| :------------- |
| 57.3 |
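
The README reports the code generation score as `p@1, 0-Shot` but does not say how it was computed. As a rough illustration only, the sketch below uses the pass@k estimator commonly applied to HumanEval-style benchmarks, which for `k = 1` reduces to the fraction of generated samples that pass the unit tests. The function names and the sample data are hypothetical and are not taken from the Nemotron evaluation setup.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem: n samples generated, c of them pass."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples exist, so any k-subset contains a pass.
        return 1.0
    # 1 minus the probability that a random k-subset contains only failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k: int = 1) -> float:
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return sum(scores) / len(scores)


# Hypothetical data: four problems, one sample each, two of them passing.
results = [(1, 1), (1, 0), (1, 1), (1, 0)]
print(f"{100 * mean_pass_at_k(results, k=1):.1f}")  # prints 50.0
```

With one sample per problem, as in the hypothetical data above, pass@1 is simply the percentage of problems solved, which is the form in which a single figure such as the one in the table is usually quoted.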