## Fine-tune Configuration

We fine-tune `5CD-AI/viso-twhin-bert-large` on 4 downstream tasks with the `transformers` library, using the following configuration:

- train_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- weight_decay: 0.01
- optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- training_epochs: 30
- model_max_length: 128
- learning_rate: 1e-5
- metric_for_best_model: wf1
- strategy: epoch

And different additional configurations for each task:

| Emotion Recognition | Hate Speech Detection | Spam Reviews Detection | Hate Speech Spans Detection |
| ------------------- | --------------------- | ---------------------- | --------------------------- |
| \- learning_rate: 1e-5 | \- learning_rate: 5e-6 | \- learning_rate: 1e-5 | \- learning_rate: 5e-6 |
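The shared configuration above can be expressed as Hugging Face `TrainingArguments` roughly as follows. This is a minimal sketch, not the authors' exact training script: the `output_dir` is hypothetical, `model_max_length` is set on the tokenizer rather than here, and the AdamW betas/epsilon listed above are already the `Trainer` defaults.

```python
from transformers import TrainingArguments

# Shared fine-tune configuration from the list above.
# Swap learning_rate per task as in the table
# (e.g. 5e-6 for Hate Speech Detection).
args = TrainingArguments(
    output_dir="viso-twhin-bert-large-finetuned",  # hypothetical path
    per_device_train_batch_size=16,
    gradient_accumulation_steps=4,   # effective batch size 16 * 4 = 64
    seed=42,
    weight_decay=0.01,
    learning_rate=1e-5,
    lr_scheduler_type="cosine",
    num_train_epochs=30,
    evaluation_strategy="epoch",     # "strategy: epoch" above
    save_strategy="epoch",
    metric_for_best_model="wf1",     # weighted F1 from a compute_metrics fn
    load_best_model_at_end=True,
)
```

Note that with `gradient_accumulation_steps: 4` the effective batch size per update is 16 × 4 = 64, and `metric_for_best_model: wf1` assumes a `compute_metrics` function that reports a metric named `wf1`.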