sjrhuschlee committed 6249d84 (parent: a0a2cc0): Update README.md
### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 6
- total_train_batch_size: 96
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 4.0
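The hyperparameters above correspond to fields of `transformers.TrainingArguments` in the Hugging Face Trainer API. The sketch below restates them as a plain dict (so the relationships can be checked without the library installed; the dict itself is an illustration, not the original training script) and shows that total_train_batch_size is derived, not set directly:

```python
# Model-card hyperparameters, keyed by their TrainingArguments field names.
# This dict is a reconstruction for illustration; the actual training script
# is not part of the card.
hparams = {
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 16,  # card: train_batch_size
    "per_device_eval_batch_size": 8,    # card: eval_batch_size
    "seed": 42,
    "gradient_accumulation_steps": 6,
    "adam_beta1": 0.9,                  # card: optimizer betas/epsilon
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "warmup_ratio": 0.1,                # card: lr_scheduler_warmup_ratio
    "num_train_epochs": 4.0,
}

# total_train_batch_size = per-device batch size x accumulation steps
# (x number of devices; the card's value of 96 implies a single device here).
total_train_batch_size = (
    hparams["per_device_train_batch_size"]
    * hparams["gradient_accumulation_steps"]
)
print(total_train_batch_size)  # 96
```

Optimizer updates are therefore applied once every 6 forward/backward passes, giving an effective batch of 96 examples per step.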
### Training results

### Framework versions

- Transformers 4.30.0.dev0
- Pytorch 2.0.1+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3