AnirudhRajagopalan1201 committed on
Commit 6119bd6 • 1 Parent(s): 14e94e2
Update README.md
README.md CHANGED
@@ -9,23 +9,14 @@ Model trained on the TinyStories Dataset, replicating https://arxiv.org/abs/2305
 Hyperparams used to train this model:
 ```
 "batch_size": 64,
-
 "block_size": 128,
-
 "lr": 6e-4,
-
 "num_hidden_layers": 8,
-
 "num_attention_heads": 8,
-
 "hidden_size": 160,
-
 "dropout": 0.1,
-
 "weight_decay": 0.01,
-
 "epochs": 5,
-
 "eval_interval": 200,
 "eval_steps": 50,
 "vocab_size": 50257,
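The hyperparameters in the diff above can be sanity-checked with a small script. The following is a minimal sketch, not the author's training code: the dictionary name `hparams` and the helper `derived_quantities` are hypothetical, and the embedding-size figure assumes a single token embedding table of shape `vocab_size × hidden_size`, which the README does not confirm.

```python
# Hyperparameter values copied from the README diff; the dict name is illustrative.
hparams = {
    "batch_size": 64,
    "block_size": 128,
    "lr": 6e-4,
    "num_hidden_layers": 8,
    "num_attention_heads": 8,
    "hidden_size": 160,
    "dropout": 0.1,
    "weight_decay": 0.01,
    "epochs": 5,
    "eval_interval": 200,
    "eval_steps": 50,
    "vocab_size": 50257,
}


def derived_quantities(hp):
    """Compute a few quantities implied by the config (hypothetical helper)."""
    # Multi-head attention needs the hidden size to split evenly across heads.
    if hp["hidden_size"] % hp["num_attention_heads"] != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")
    return {
        # Width of each attention head.
        "head_dim": hp["hidden_size"] // hp["num_attention_heads"],
        # Tokens processed per optimizer step, ignoring gradient accumulation.
        "tokens_per_batch": hp["batch_size"] * hp["block_size"],
        # Assumption: one token embedding table of vocab_size x hidden_size.
        "embedding_params": hp["vocab_size"] * hp["hidden_size"],
    }


derived = derived_quantities(hparams)
for name, value in derived.items():
    print(f"{name}: {value}")
```

Under these assumptions, each head is 20 dimensions wide and each batch covers 8192 tokens, so the check mainly confirms that `hidden_size` and `num_attention_heads` are compatible.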