update model card README.md
README.md
@@ -71,9 +71,11 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0007
-- train_batch_size:
-- eval_batch_size:
+- train_batch_size: 32
+- eval_batch_size: 16
 - seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_ratio: 0.01
@@ -177,4 +179,4 @@ The following hyperparameters were used during training:
 'weight_decay': 0.1}}
 
 # Wandb URL:
-https://wandb.ai/kejian/uncategorized/runs/
+https://wandb.ai/kejian/uncategorized/runs/1efu1obk
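
Note on the new values: the added total_train_batch_size of 64 is consistent with train_batch_size 32 multiplied by gradient_accumulation_steps 2 on a single device. The device count is not stated in this diff, so that part is an assumption. The sketch below reproduces that arithmetic and shows roughly what a linear schedule with a 0.01 warmup ratio looks like. The function names and the total step count are hypothetical and are not taken from the training code.

```python
import math


def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Number of samples that contribute to one optimizer update.

    num_devices=1 is an assumption; the model card does not state it.
    """
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step, total_steps, peak_lr=0.0007, warmup_ratio=0.01):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0.

    total_steps is a hypothetical input; the diff does not give it.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Values from the updated model card: 32 per step, 2 accumulation steps.
    print(effective_batch_size(32, 2))
    # Illustrative schedule over a made-up 1000-step run.
    for s in (0, 5, 10, 505, 1000):
        print(s, linear_lr(s, total_steps=1000))
```

If the run had used more than one device, train_batch_size or gradient_accumulation_steps would need to be smaller to reach the same total of 64.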