Commit c1f99af
Author: qnguyen3
1 parent: 0144ae4

Update README.md

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -38,8 +38,22 @@ widget:
  - text: In the context of computer programming, an algorithm is
  example_title: Algorithm Definition
  ---
+ # Mixsmol-4x400M-v0.1
+ This is the first checkpoint (Epoch 1) of Mixsmol-4x400M-v0.1.
+ Note that this is an experiment in data mixing. Therefore, we only trained the model on 50B tokens (95% English and 5% Vietnamese) to test the following:
+ - Reasoning capabilities through pretraining on high-quality synthetic textbook data
+ - Crosslingual understanding through machine translation and multilingual, multi-task pretraining
 
+ After verifying our hypotheses with this run, we will schedule a second run with more data and compute so the model can reach its full capability.
 
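The language split stated above can be turned into a per-language token budget. The following is only a minimal sketch: it assumes the 95%/5% figures are measured in tokens of the 50B-token run, and the names `token_budget`, `TOTAL_TOKENS`, and `LANGUAGE_SHARE_PERCENT` are hypothetical, introduced here for illustration rather than taken from the training code.

```python
# Sketch: token budget implied by a 50B-token run split 95% English / 5% Vietnamese.
# Assumption: the percentages refer to token counts, not sample counts.
TOTAL_TOKENS = 50_000_000_000
LANGUAGE_SHARE_PERCENT = {"English": 95, "Vietnamese": 5}


def token_budget(total_tokens, share_percent):
    """Split a total token count according to integer percentage shares."""
    if sum(share_percent.values()) != 100:
        raise ValueError("shares must sum to 100 percent")
    # Integer arithmetic avoids floating-point rounding in the token counts.
    return {lang: total_tokens * pct // 100 for lang, pct in share_percent.items()}


budget = token_budget(TOTAL_TOKENS, LANGUAGE_SHARE_PERCENT)
for lang, tokens in budget.items():
    print(f"{lang}: {tokens / 1e9:.1f}B tokens")
```

Under that assumption, the run would see roughly 47.5B English tokens and 2.5B Vietnamese tokens.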
+ ## Data
+ - Synthetic Textbooks: 8M samples
+ - RefinedWeb: 1M samples
+ - RedPajama-v2: 500K samples
+ - MathPile: Everything
+ - ThePile: MiniPile Subset
+ - GoodWiki
+ - Instruction Pretraining: 250K samples
 
  | Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
  |-------------|-------|------|-----:|--------|-----:|---|-----:|