chaoscodes committed on
Commit 37c43e7
1 Parent(s): 77b87b3

Update README.md

Files changed (1)
  1. README.md +3 -21
README.md CHANGED
@@ -59,25 +59,13 @@ Here we list our data distribution in each stage:
 
  | Corpus | Basic pretraining | Continual pretraining with specific domain | Cooldown |
  | ------------- | ----------------- | ------------------------------------------ | -------- |
- | RedPajamaBook | 5.4 | 5.4 | 5.4 |
- | C4 | 35.0 | 35.0 | 35.0 |
- | CommonCrawl | 70.1 | 70.1 | 70.1 |
- | Github | 6.5 | 6.5 | 6.5 |
- | StackExchange | 4.2 | 4.2 | 4.2 |
- | ArXiv | 5.7 | 5.7 | 5.7 |
- | Wikipedia | 4.5 | 4.5 | 4.5 |
+ | Slimpajama | 100.0 | 100.0 | 100.0 |
 
  ### TinyLlama_v1.1_math_code
 
  | Corpus | Basic pretraining | Continual pretraining with specific domain | Cooldown |
  | ------------- | ----------------- | ------------------------------------------ | -------- |
- | RedPajamaBook | 5.4 | - | - |
- | C4 | 35.0 | 21.6 | 21.6 |
- | CommonCrawl | 70.1 | 43.0 | 43.0 |
- | Github | 6.5 | - | - |
- | StackExchange | 4.2 | 2.6 | 2.6 |
- | ArXiv | 5.7 | 5.0 | 5.0 |
- | Wikipedia | 4.5 | 2.8 | 2.8 |
+ | Slimpajama | 100.0 | 75.0 | 75.0 |
  | starcoder | - | 15.0 | 15.0 |
  | proof_pile | - | 10.0 | 10.0 |
 
@@ -85,13 +73,7 @@ Here we list our data distribution in each stage:
 
  | orpus | Basic pretraining | Continual pretraining with specific domain | Cooldown |
  | ------------- | ----------------- | ------------------------------------------ | -------- |
- | RedPajamaBook | 5.4 | - | - |
- | C4 | 35.0 | 14.6 | 14.6 |
- | CommonCrawl | 70.1 | 29.3 | 29.3 |
- | Github | 6.5 | - | - |
- | StackExchange | 4.2 | 1.8 | 1.8 |
- | ArXiv | 5.7 | 2.4 | 2.4 |
- | Wikipedia | 4.5 | 1.9 | 1.9 |
+ | Slimpajama | 100.0 | 50.0 | 50.0 |
  | skypile | - | 50.0 | 50.0 |
 
  ### How to use