Update README.md
README.md
CHANGED
@@ -155,4 +155,20 @@ model_path = "5CD-AI/visobert-14gb-corpus"
 mask_filler = pipeline("fill-mask", model_path)
 
 mask_filler("ăn nói xà <mask>", top_k=10)
-```
+```
+
+## Fine-tune Configuration
+We fine-tune `5CD-AI/viso-twhin-bert-large` on 4 downstream tasks using the `transformers` library with the following configuration:
+- seed: 42
+- gradient_accumulation_steps: 1
+- weight_decay: 0.01
+- optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
+- training_epochs: 30
+- model_max_length: 128
+- learning_rate: 1e-5
+
+Additional task-specific configurations:
+
+| Emotion Recognition | Hate Speech Detection | Spam Reviews Detection | Hate Speech Spans Detection |
+| --------------------------------------------------------------------------------- | --------------------------------------------------------------------------------- | --------------------------------------------------------------------------------- | --------------------------------------------------------------------------------- |
+|\- train_batch_size: 64<br>\- lr_scheduler_type: linear | \- train_batch_size: 32<br>\- lr_scheduler_type: linear | \- train_batch_size: 32<br>\- lr_scheduler_type: cosine | \- train_batch_size: 32<br>\- lr_scheduler_type: cosine |
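
For readers who want to reproduce the setup, the configuration added in this change could be written as keyword arguments for `transformers.TrainingArguments`. This is a minimal sketch, not the authors' training script. The task keys, the helper names, and the `output_dir` parameter are hypothetical, and mapping `train_batch_size` to `per_device_train_batch_size` assumes training on a single device.

```python
# Sketch only: shared fine-tuning settings from the README, expressed with
# TrainingArguments parameter names. Not the authors' actual script.
SHARED = {
    "seed": 42,
    "gradient_accumulation_steps": 1,
    "weight_decay": 0.01,
    "adam_beta1": 0.9,      # AdamW betas=(0.9, 0.999)
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "num_train_epochs": 30,
    "learning_rate": 1e-5,
}

# model_max_length is a tokenizer setting, not a TrainingArguments field,
# e.g. AutoTokenizer.from_pretrained(model_path, model_max_length=128).
MODEL_MAX_LENGTH = 128

# Task keys are hypothetical labels for the four columns of the table.
# Assumption: train_batch_size means the per-device batch size on one device.
PER_TASK = {
    "emotion_recognition": {"per_device_train_batch_size": 64, "lr_scheduler_type": "linear"},
    "hate_speech_detection": {"per_device_train_batch_size": 32, "lr_scheduler_type": "linear"},
    "spam_reviews_detection": {"per_device_train_batch_size": 32, "lr_scheduler_type": "cosine"},
    "hate_speech_spans_detection": {"per_device_train_batch_size": 32, "lr_scheduler_type": "cosine"},
}


def training_kwargs(task):
    """Merge the shared settings with the settings for one task."""
    if task not in PER_TASK:
        raise KeyError(f"unknown task: {task}")
    return {**SHARED, **PER_TASK[task]}


def build_training_arguments(task, output_dir):
    """Build TrainingArguments for a task (requires transformers and torch)."""
    from transformers import TrainingArguments  # imported lazily on purpose

    return TrainingArguments(output_dir=output_dir, **training_kwargs(task))
```

Calling `training_kwargs("emotion_recognition")` returns a plain dictionary, so the values can be inspected without installing `transformers`; `build_training_arguments` is only needed when actually training.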