victormiller
committed on
Commit
•
8a7d167
1
Parent(s):
fe01884
Update README.md
README.md
CHANGED
@@ -16,23 +16,22 @@ K2 is a fully transparent large language model on par with Llama 2 - 70B.
|
|
16 |
<center><img src="eval_table_temp.png" alt="eval table"/></center>
|
17 |
|
18 |
## Datasets and Mix
|
19 |
-
| Dataset | Starting Tokens | Multiplier | Total Tokens
|
20 |
| ----------- | ----------- | ----------- | ----------- | ----------- |
|
21 |
-
| dm-math | 4.
|
22 |
-
|
|
23 |
-
|
|
24 |
-
|
|
25 |
-
|
|
26 |
-
|
|
27 |
-
|
|
28 |
-
|
|
29 |
-
|
|
30 |
-
|
|
31 |
-
|
|
32 |
-
|
|
33 |
-
|
|
34 |
-
|
|
35 |
-
| Checkpoint 356[link] | Checkpoint 351[link] | Checkpoint 355[link] | Checkpoint 355[link] |
|
36 |
|
37 |
## First 10 Checkpoints
|
38 |
| Checkpoints | |
|
|
|
16 |
<center><img src="eval_table_temp.png" alt="eval table"/></center>
|
17 |
|
18 |
## Datasets and Mix
|
19 |
+
| Dataset | Starting Tokens | Multiplier | Total Tokens | % of Total |
|
20 |
| ----------- | ----------- | ----------- | ----------- | ----------- |
|
21 |
+
| dm-math | 4.33B | 3x | 13B | 1% |
|
22 |
+
| pubmed-abstracts | 4.77B | 3x | 14.3B | 1.1% |
|
23 |
+
| uspto | 4.77B | 3x | 14.3B | 1.1% |
|
24 |
+
| pubmed-central | 26B | 1x | 26B | 2% |
|
25 |
+
| redpajama.arxiv | 27.3B | 1x | 27.3B | 2.1% |
|
26 |
+
| starcoder.spm | 67.6B | 0.5x | 33.8B | 2.6% |
|
27 |
+
| starcoder.fim | 67.6B | 0.5x | 33.8B | 2.6% |
|
28 |
+
| redpajama.stackexchange | 61.1B | 1x | 61.1B | 4.7% |
|
29 |
+
| starcoder | 132.6B | 0.5x | 66.3B | 5.1% |
|
30 |
+
| pile-of-law | 76.7B | 1x | 76.7B | 5.9% |
|
31 |
+
| redpajama.book | 80.6B | 1x | 80.6B | 6.2% |
|
32 |
+
| s2orc | 107.9B | 1x | 107.9B | 8.3% |
|
33 |
+
| redpajama.wikipedia | 22.1B | 6x | 132.6B | 10.2% |
|
34 |
+
| refinedweb | 612.3B | 1x | 612.3B | 47.1% |
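
The columns in the table above appear to be related by simple arithmetic: Total Tokens looks like Starting Tokens times Multiplier, and % of Total looks like each dataset's share of the summed totals. The sketch below recomputes both from the first two columns as a sanity check. The names `MIX` and `mix_summary` are illustrative and not part of the K2 repository, and the relationship between columns is an assumption inferred from the numbers, not something the README states.

```python
# Rows copied from the table: (dataset, starting tokens in billions, multiplier).
MIX = [
    ("dm-math", 4.33, 3),
    ("pubmed-abstracts", 4.77, 3),
    ("uspto", 4.77, 3),
    ("pubmed-central", 26.0, 1),
    ("redpajama.arxiv", 27.3, 1),
    ("starcoder.spm", 67.6, 0.5),
    ("starcoder.fim", 67.6, 0.5),
    ("redpajama.stackexchange", 61.1, 1),
    ("starcoder", 132.6, 0.5),
    ("pile-of-law", 76.7, 1),
    ("redpajama.book", 80.6, 1),
    ("s2orc", 107.9, 1),
    ("redpajama.wikipedia", 22.1, 6),
    ("refinedweb", 612.3, 1),
]


def mix_summary(rows):
    """Return (grand_total, [(name, total_tokens, percent_of_total), ...]).

    Assumes total = starting tokens * multiplier, which is inferred
    from the table and not documented by the source.
    """
    totals = [(name, start * mult) for name, start, mult in rows]
    grand_total = sum(total for _, total in totals)
    summary = [
        (name, total, 100.0 * total / grand_total) for name, total in totals
    ]
    return grand_total, summary


grand_total, summary = mix_summary(MIX)
for name, total, pct in summary:
    print(f"{name:<24} {total:7.1f}B {pct:5.1f}%")
print(f"{'total':<24} {grand_total:7.1f}B")
```

Under that assumption the totals sum to roughly 1.3T tokens, and the recomputed shares agree with the % of Total column to one decimal place.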
|
|
|
35 |
|
36 |
## First 10 Checkpoints
|
37 |
| Checkpoints | |
|