We used the first 100 million tokens of the 10BT Sample of Fineweb-Edu to train …

- Batch Size: 32
- Gradient Accumulation Steps: 4
- Compile model: False
- Device Type: float16 - CUDA on Kaggle T4 16GB GPU (train time: ~71min)

## Training code

As in all of our models, you can find the full training code in this repo in the files `train.py`, `model.py`, `configurator.py` and `prepare.py`.
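The settings above imply an effective batch of 32 × 4 = 128 sequences per optimizer step, since gradients are accumulated over 4 micro-batches before each update. As a minimal sketch of that arithmetic (the variable names below are illustrative assumptions, not necessarily the ones used in `train.py` or `configurator.py`):

```python
# Hypothetical names mirroring the README's settings; the repo's actual
# config variables may be spelled differently.
batch_size = 32                  # sequences per micro-batch on the T4
gradient_accumulation_steps = 4  # micro-batches accumulated per optimizer step
compile_model = False            # torch.compile disabled
dtype = "float16"                # mixed precision to fit a 16GB Kaggle T4

# Sequences contributing to each optimizer update:
effective_batch = batch_size * gradient_accumulation_steps
print(effective_batch)  # 128
```

Accumulating gradients this way trades wall-clock time for memory: each optimizer step sees 128 sequences while only 32 ever reside on the GPU at once.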