Update README.md
This is a tiny 20.75M-parameter model showing how well small models can perform on a small amount of data.

## Training data
We trained this model on the first 100 million tokens of the 10BT Sample of Fineweb-Edu for 5000 steps, reaching a final training loss of 4.2044 and a validation loss of 4.1566.
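As a quick sanity check, the per-step token budget follows directly from the figures above (batch and context sizes are not stated in this card, so only the aggregate is computed):

```python
# Aggregate token budget implied by the training-data figures above.
total_tokens = 100_000_000  # first 100M tokens of the 10BT Sample
steps = 5000                # optimizer steps reported
tokens_per_step = total_tokens // steps
print(tokens_per_step)  # 20,000 tokens consumed per optimizer step
```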
## Training specs
- Architecture: nanoGPT
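The 20.75M figure can be cross-checked with a standard parameter-count estimate for GPT-2-style models like nanoGPT. This sketch assumes the usual GPT-2 layout (tied output head, learned positional embeddings, biased linears and layer norms); this model's actual nanoGPT config is not stated in the card.

```python
def gpt_param_count(n_layer: int, n_embd: int, vocab_size: int, block_size: int) -> int:
    """Approximate parameter count for a GPT-2-style model (tied LM head)."""
    wte = vocab_size * n_embd   # token embeddings (shared with the output head)
    wpe = block_size * n_embd   # learned positional embeddings
    # Per transformer block, with biases:
    #   attention qkv (3n^2 + 3n) + attention proj (n^2 + n)
    #   MLP with 4x expansion (8n^2 + 5n), two layer norms (4n)
    per_block = 12 * n_embd**2 + 13 * n_embd
    ln_f = 2 * n_embd           # final layer norm
    return wte + wpe + n_layer * per_block + ln_f

# Example: the GPT-2 small (124M) config reproduces the known count.
print(gpt_param_count(n_layer=12, n_embd=768, vocab_size=50257, block_size=1024))
```

Plugging in candidate nanoGPT configs until the function returns roughly 20.75M is one way to recover the likely depth and width of this model.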