Public100_1L_BERT_10epoch
This model is a fine-tuned version of Youssef320/Public100_1L_BERT_5epoch_again on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 3.4278
- Top 1 Macro F1 Score: 0.1222
- Top 1 Weighted F1 Score: 0.1793
- Top 3 Macro F1 Score: 0.2459
- Top 3 Weighted F1 Score: 0.3330
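The card does not say how the top-k F1 scores were computed. A common convention, sketched below as an assumption (the actual evaluation code is not published), is to count a sample as correctly classified when the true label appears anywhere in the model's top-k predictions, and otherwise score the top-1 prediction as the (wrong) output; per-class F1 is then averaged either uniformly (macro) or by class support (weighted).

```python
# Hedged sketch of top-k macro/weighted F1, pure Python (no sklearn).
# Assumption: a top-k "hit" is credited to the true class; on a miss,
# the top-1 prediction is scored as the output class.
from collections import Counter

def topk_f1(true_labels, topk_preds, average="macro"):
    """true_labels: list of ints; topk_preds: list of lists of ints,
    each inner list holding the k highest-scoring classes, best first."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for y, preds in zip(true_labels, topk_preds):
        y_hat = y if y in preds else preds[0]
        if y_hat == y:
            tp[y] += 1
        else:
            fp[y_hat] += 1
            fn[y] += 1
    classes = set(true_labels) | {p for ps in topk_preds for p in ps}
    f1s, weights = [], []
    for c in sorted(classes):
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)
        weights.append(sum(1 for y in true_labels if y == c))
    if average == "macro":
        return sum(f1s) / len(f1s)
    return sum(f * w for f, w in zip(f1s, weights)) / sum(weights)
```

The large gap between macro (0.1222) and weighted (0.1793) scores above suggests a class-imbalanced label set, since weighted averaging favors frequent classes.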
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 2048
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 5.0
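The hyperparameters above are internally consistent: the total train batch size is the per-step batch size times the number of gradient accumulation steps. A quick arithmetic check (the 512-steps-per-epoch figure comes from the results table, where step 512 falls at epoch 1.0):

```python
# Effective batch size implied by the hyperparameters above.
train_batch_size = 64
gradient_accumulation_steps = 32
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 2048, matching the reported value

# From the results table, one epoch is 512 optimizer steps, so each
# epoch covers roughly 512 * 2048 = 1,048,576 training samples.
samples_per_epoch = 512 * total_train_batch_size
print(samples_per_epoch)  # 1048576
```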
Training results
Training Loss | Epoch | Step | Validation Loss | Top 1 Macro F1 Score | Top 1 Weighted F1 Score | Top 3 Macro F1 Score | Top 3 Weighted F1 Score |
---|---|---|---|---|---|---|---|
3.125 | 0.12 | 64 | 3.4449 | 0.1028 | 0.1601 | 0.2201 | 0.3131 |
3.1089 | 0.25 | 128 | 3.4470 | 0.1024 | 0.1595 | 0.2178 | 0.3115 |
3.0986 | 0.38 | 192 | 3.4451 | 0.1046 | 0.1625 | 0.2230 | 0.3149 |
3.122 | 0.5 | 256 | 3.4369 | 0.1043 | 0.1619 | 0.2205 | 0.3139 |
3.1461 | 0.62 | 320 | 3.4346 | 0.1044 | 0.1606 | 0.2208 | 0.3121 |
3.158 | 0.75 | 384 | 3.4273 | 0.1039 | 0.1636 | 0.2220 | 0.3174 |
3.1513 | 0.88 | 448 | 3.4357 | 0.1066 | 0.1641 | 0.2230 | 0.3165 |
3.1824 | 1.0 | 512 | 3.4250 | 0.1062 | 0.1649 | 0.2251 | 0.3184 |
3.0692 | 1.12 | 576 | 3.4513 | 0.1084 | 0.1653 | 0.2278 | 0.3181 |
3.0732 | 1.25 | 640 | 3.4512 | 0.1065 | 0.1641 | 0.2226 | 0.3151 |
3.1023 | 1.38 | 704 | 3.4476 | 0.1070 | 0.1644 | 0.2253 | 0.3162 |
3.1039 | 1.5 | 768 | 3.4407 | 0.1060 | 0.1647 | 0.2267 | 0.3177 |
3.1249 | 1.62 | 832 | 3.4351 | 0.1094 | 0.1669 | 0.2262 | 0.3205 |
3.1288 | 1.75 | 896 | 3.4293 | 0.1095 | 0.1660 | 0.2288 | 0.3197 |
3.0974 | 1.88 | 960 | 3.4252 | 0.1066 | 0.1649 | 0.2275 | 0.3184 |
3.13 | 2.0 | 1024 | 3.4203 | 0.1106 | 0.1693 | 0.2295 | 0.3232 |
3.0072 | 2.12 | 1088 | 3.4536 | 0.1104 | 0.1685 | 0.2313 | 0.3217 |
2.992 | 2.25 | 1152 | 3.4639 | 0.1081 | 0.1670 | 0.2304 | 0.3208 |
3.0475 | 2.38 | 1216 | 3.4533 | 0.1112 | 0.1682 | 0.2310 | 0.3202 |
3.0531 | 2.5 | 1280 | 3.4458 | 0.1142 | 0.1702 | 0.2358 | 0.3218 |
3.0568 | 2.62 | 1344 | 3.4435 | 0.1131 | 0.1698 | 0.2354 | 0.3237 |
3.063 | 2.75 | 1408 | 3.4348 | 0.1127 | 0.1704 | 0.2346 | 0.3234 |
3.0819 | 2.88 | 1472 | 3.4336 | 0.1103 | 0.1689 | 0.2320 | 0.3228 |
3.0764 | 3.0 | 1536 | 3.4202 | 0.1145 | 0.1731 | 0.2345 | 0.3276 |
2.928 | 3.12 | 1600 | 3.4660 | 0.1149 | 0.1734 | 0.2378 | 0.3252 |
2.921 | 3.25 | 1664 | 3.4738 | 0.1113 | 0.1702 | 0.2314 | 0.3223 |
2.9601 | 3.38 | 1728 | 3.4644 | 0.1147 | 0.1721 | 0.2385 | 0.3246 |
2.9854 | 3.5 | 1792 | 3.4724 | 0.1167 | 0.1727 | 0.2391 | 0.3235 |
3.0092 | 3.62 | 1856 | 3.4579 | 0.1171 | 0.1737 | 0.2406 | 0.3262 |
2.9751 | 3.75 | 1920 | 3.4515 | 0.1177 | 0.1744 | 0.2413 | 0.3279 |
3.029 | 3.88 | 1984 | 3.4404 | 0.1176 | 0.1748 | 0.2404 | 0.3282 |
3.0906 | 4.0 | 2048 | 3.4211 | 0.1174 | 0.1745 | 0.2420 | 0.3293 |
2.9352 | 4.12 | 2112 | 3.4690 | 0.1173 | 0.1753 | 0.2405 | 0.3280 |
2.9351 | 4.25 | 2176 | 3.4686 | 0.1193 | 0.1745 | 0.2422 | 0.3262 |
2.9849 | 4.38 | 2240 | 3.4582 | 0.1194 | 0.1751 | 0.2433 | 0.3280 |
2.9944 | 4.5 | 2304 | 3.4504 | 0.1191 | 0.1766 | 0.2432 | 0.3302 |
2.963 | 4.62 | 2368 | 3.4430 | 0.1176 | 0.1749 | 0.2413 | 0.3284 |
3.0162 | 4.75 | 2432 | 3.4313 | 0.1226 | 0.1784 | 0.2435 | 0.3305 |
3.007 | 4.88 | 2496 | 3.4341 | 0.1217 | 0.1780 | 0.2477 | 0.3318 |
2.9922 | 5.0 | 2560 | 3.4278 | 0.1222 | 0.1793 | 0.2459 | 0.3330 |
Framework versions
- Transformers 4.20.1
- Pytorch 1.12.1+cu102
- Datasets 2.0.0
- Tokenizers 0.11.0
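Since no hosted inference endpoint serves this model, it would have to be loaded locally. A minimal sketch, with the caveats that the repo id below is inferred from the card title (not confirmed) and that the `top_k` pipeline argument assumes a recent transformers version:

```python
# Hedged sketch: local inference via the transformers pipeline API.
MODEL_ID = "Youssef320/Public100_1L_BERT_10epoch"  # assumed repo id

def build_classifier():
    # Deferred import so this module loads without transformers installed.
    from transformers import pipeline
    # top_k=3 mirrors the top-3 metrics reported in this card.
    return pipeline("text-classification", model=MODEL_ID, top_k=3)
```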