LH-Tech-AI committed on
Commit e8463a9 · verified · 1 Parent(s): deff1d6

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -20,7 +20,7 @@ tags:
 This is a tiny 20.75M parameter model showing how small models can perform on a little bunch of data.
 
 ## Training data
-We used the first 100 million tokens of the 10BT Sample of Fineweb-Edu to train this model for 5000 steps for a final loss of ~4.0 and a val loss of 4.1566.
+We used the first 100 million tokens of the 10BT Sample of Fineweb-Edu to train this model for 5000 steps for a final loss of 4.2044 and a val loss of 4.1566.
 
 ## Training specs
 - Architecture: nanoGPT
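
To put the two loss figures from this change in perspective, here is a minimal sketch that converts them to perplexity. It assumes the reported values are mean per-token cross-entropy in nats, which is typical for nanoGPT-style training but is not stated in the README; the helper name `perplexity` is made up for this illustration.

```python
import math


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss_nats)


# Values taken from the updated README line; the unit (nats) is an assumption.
train_loss = 4.2044
val_loss = 4.1566

print(f"train perplexity: {perplexity(train_loss):.1f}")
print(f"val perplexity:   {perplexity(val_loss):.1f}")
```

Under that assumption, the corrected training loss corresponds to a perplexity in the mid-60s, close to the validation figure.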