Update README.md
README.md
CHANGED
@@ -9,9 +9,9 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->

-#
+# TinyStories-GPT2-3M

-This model is a
+This model is a tiny (3M trainable parameters) GPT-2 model pre-trained for 3 epochs on the [TinyStories](https://huggingface.co/datasets/roneneldan/TinyStories) V2 dataset.

 ## Model description

@@ -26,6 +26,7 @@ More information needed
 More information needed

 ## Training procedure
+Trained for 400k steps (~7 hours) on 2xH100 80GB PCIe with 32vCPU and 500GB RAM on Runpod.

 ### Training hyperparameters

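
The training figures added in this change (400k steps, roughly 7 hours, 3 epochs, 2 GPUs) imply a rough throughput, which the minimal sketch below works out. It assumes the steps are split evenly across the three epochs and treats "~7 hours" as exactly 7, so the results are approximations. The function and variable names are illustrative and do not come from the repository.

```python
# Figures quoted in the README change; everything derived from them is approximate.
TOTAL_STEPS = 400_000      # "400k steps"
WALL_CLOCK_HOURS = 7       # "~7 hours", treated as exactly 7 (assumption)
EPOCHS = 3                 # "pre-trained for 3 epochs"
NUM_GPUS = 2               # "2xH100"


def training_summary(total_steps, hours, epochs, num_gpus):
    """Derive rough throughput numbers from the quoted training figures."""
    seconds = hours * 3600
    return {
        "steps_per_second": total_steps / seconds,
        # Assumes steps are divided evenly across epochs.
        "steps_per_epoch": total_steps / epochs,
        "gpu_hours": hours * num_gpus,
    }


summary = training_summary(TOTAL_STEPS, WALL_CLOCK_HOURS, EPOCHS, NUM_GPUS)
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

Under these assumptions the run averages about 16 steps per second and uses about 14 GPU-hours. The README does not say whether a "step" is an optimizer step or what batch size was used, so tokens-per-second cannot be derived from these figures alone.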