PowerInfer/Bamboo-base-v0_1

yixinsong committed on Mar 25

Commit f7a2a30 (1 parent: d3735d4)

Update README.md

Files changed (1)

README.md +1 -1

@@ -44,7 +44,7 @@ The following table shows the hyper-paramters we used in our training process.
 | Batch Size            | 4M          |
 | Weight Decay          | 0.1         |
-**Second phase**: We further adjusted the training corpus ratio, incorporating more domain-specific datasets(Math, Coding), and continued training for 50B tokens.
+**Second phase**: We further adjusted the training corpus ratio, incorporating more domain-specific datasets (e.g., Math, Coding), and continued training for 50B tokens.
 | Hyper-parameters      |             |
 | --------------------- | ----------- |