emozilla committed on
Commit
54fbe61
1 Parent(s): 8dfcf0f

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -17,6 +17,6 @@ paricularly in the Science Fiction and Fantasy genres.
 The following hyperparameters were used
 |Batch Size|Epochs|Context length|Learning rate|Scheduler|Weight decay|Warmup ratio|
 |----------|------|--------------|-------------|---------|------------|------------|
-| 64 | 3 | 8192 | 2e-5 | Cosine | 0. | 0.03 |
+| 128 | 3 | 8192 | 2e-5 | Cosine | 0. | 0.03 |
 
 The model reached a training loss of 2.008 and took approximately 8 hours on 8x A100 80 GB GPUs.
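
To make the scheduler columns in the table concrete, here is a minimal sketch of how a learning rate of 2e-5 with a cosine scheduler and a warmup ratio of 0.03 could evolve over training. The README only names the scheduler and the ratio, so the linear warmup shape, the decay to zero, the rounding of the warmup step count, and the `TOTAL_STEPS` value are all assumptions made for illustration, not details of the actual training run.

```python
import math

# Values taken from the hyperparameter table above.
PEAK_LR = 2e-5
WARMUP_RATIO = 0.03


def lr_at_step(step: int, total_steps: int,
               peak_lr: float = PEAK_LR,
               warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at a given optimizer step.

    Assumed shape: linear warmup from 0 to peak_lr, then cosine decay to 0.
    """
    warmup_steps = max(1, round(total_steps * warmup_ratio))
    if step < warmup_steps:
        # Linear ramp during the warmup phase.
        return peak_lr * step / warmup_steps
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical step count; the README does not state the dataset size.
TOTAL_STEPS = 1000

for step in (0, 15, 30, 515, 1000):
    print(f"step {step:4d}: lr = {lr_at_step(step, TOTAL_STEPS):.3e}")
```

Under these assumptions the rate climbs linearly for the first 3% of steps, peaks at 2e-5, and then follows half a cosine wave down to zero at the final step.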