Public100_1L_BERT_5epoch_again
This model is a fine-tuned version of Youssef320/Public100_1L_BERT_5epoch. The training dataset is not specified. It achieves the following results on the evaluation set:
- Loss: 3.4088
- Top 1 Macro F1 Score: 0.1019
- Top 1 Weighted F1 Score: 0.1606
- Top 3 Macro F1 Score: 0.2198
- Top 3 Weighted F1 Score: 0.3153
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 2048
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- num_epochs: 4.0
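The relationship between the batch-size settings listed above can be checked with a short sketch. It assumes the usual convention that the total batch size is the per-device batch size multiplied by the gradient accumulation steps and the number of devices. The function name `effective_batch_size` and the single-device default are illustrative assumptions, not part of the original training script.

```python
def effective_batch_size(per_device_batch: int,
                         grad_accum_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer update.

    num_devices=1 is an assumption; the card does not state the device count.
    """
    return per_device_batch * grad_accum_steps * num_devices


# Values taken from the hyperparameter list above.
total = effective_batch_size(per_device_batch=64, grad_accum_steps=32)
print(total)  # 2048, matching total_train_batch_size
```

Under that convention, the reported total of 2048 is consistent with training on a single device.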
Training results
Training Loss | Epoch | Step | Validation Loss | Top 1 Macro F1 Score | Top 1 Weighted F1 Score | Top 3 Macro F1 Score | Top 3 Weighted F1 Score |
---|---|---|---|---|---|---|---|
3.5163 | 0.12 | 64 | 3.5267 | 0.0688 | 0.1259 | 0.1655 | 0.2710 |
3.4858 | 0.25 | 128 | 3.5224 | 0.0692 | 0.1260 | 0.1700 | 0.2728 |
3.444 | 0.38 | 192 | 3.5111 | 0.0751 | 0.1334 | 0.1730 | 0.2787 |
3.4475 | 0.5 | 256 | 3.5042 | 0.0746 | 0.1317 | 0.1738 | 0.2790 |
3.4461 | 0.62 | 320 | 3.4986 | 0.0750 | 0.1320 | 0.1731 | 0.2777 |
3.4652 | 0.75 | 384 | 3.4898 | 0.0781 | 0.1373 | 0.1796 | 0.2849 |
3.4443 | 0.88 | 448 | 3.4867 | 0.0802 | 0.1380 | 0.1811 | 0.2852 |
3.4828 | 1.0 | 512 | 3.4726 | 0.0797 | 0.1392 | 0.1836 | 0.2893 |
3.4113 | 1.12 | 576 | 3.4760 | 0.0819 | 0.1409 | 0.1863 | 0.2909 |
3.4054 | 1.25 | 640 | 3.4737 | 0.0822 | 0.1408 | 0.1827 | 0.2881 |
3.4218 | 1.38 | 704 | 3.4678 | 0.0826 | 0.1418 | 0.1861 | 0.2897 |
3.4095 | 1.5 | 768 | 3.4580 | 0.0847 | 0.1436 | 0.1890 | 0.2934 |
3.4153 | 1.62 | 832 | 3.4534 | 0.0858 | 0.1459 | 0.1904 | 0.2959 |
3.4154 | 1.75 | 896 | 3.4468 | 0.0855 | 0.1450 | 0.1921 | 0.2961 |
3.3818 | 1.88 | 960 | 3.4436 | 0.0836 | 0.1430 | 0.1905 | 0.2939 |
3.4033 | 2.0 | 1024 | 3.4368 | 0.0878 | 0.1481 | 0.1960 | 0.2996 |
3.3245 | 2.12 | 1088 | 3.4500 | 0.0894 | 0.1509 | 0.1972 | 0.3019 |
3.2943 | 2.25 | 1152 | 3.4536 | 0.0887 | 0.1485 | 0.1995 | 0.3015 |
3.3332 | 2.38 | 1216 | 3.4468 | 0.0900 | 0.1488 | 0.2005 | 0.3004 |
3.3483 | 2.5 | 1280 | 3.4377 | 0.0924 | 0.1523 | 0.2044 | 0.3035 |
3.3408 | 2.62 | 1344 | 3.4341 | 0.0923 | 0.1519 | 0.2066 | 0.3050 |
3.343 | 2.75 | 1408 | 3.4293 | 0.0928 | 0.1527 | 0.2052 | 0.3054 |
3.3487 | 2.88 | 1472 | 3.4235 | 0.0921 | 0.1525 | 0.2040 | 0.3038 |
3.348 | 3.0 | 1536 | 3.4169 | 0.0956 | 0.1563 | 0.2112 | 0.3108 |
3.2211 | 3.12 | 1600 | 3.4374 | 0.0966 | 0.1569 | 0.2123 | 0.3091 |
3.2275 | 3.25 | 1664 | 3.4398 | 0.0953 | 0.1549 | 0.2068 | 0.3073 |
3.2523 | 3.38 | 1728 | 3.4343 | 0.0967 | 0.1556 | 0.2109 | 0.3076 |
3.2741 | 3.5 | 1792 | 3.4355 | 0.0980 | 0.1561 | 0.2147 | 0.3081 |
3.2815 | 3.62 | 1856 | 3.4259 | 0.0994 | 0.1589 | 0.2160 | 0.3112 |
3.2517 | 3.75 | 1920 | 3.4184 | 0.0984 | 0.1577 | 0.2179 | 0.3116 |
3.2801 | 3.88 | 1984 | 3.4134 | 0.0996 | 0.1592 | 0.2173 | 0.3131 |
3.2925 | 4.0 | 2048 | 3.4088 | 0.1019 | 0.1606 | 0.2198 | 0.3153 |
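The step and epoch columns in the table above follow a simple pattern, sketched below: evaluation runs every 64 optimizer steps, and epoch 1.0 is reached at step 512. The constant and function names are illustrative, and the per-epoch example count is only a rough estimate that assumes every optimizer step consumed a full batch of 2048 examples.

```python
# Values read from the training results table and hyperparameter list.
STEPS_PER_EPOCH = 512          # step at which the Epoch column reaches 1.0
TOTAL_STEPS = 2048             # final step in the table
EVAL_EVERY = 64                # spacing between logged evaluation steps
TOTAL_TRAIN_BATCH_SIZE = 2048  # from the hyperparameter list


def epoch_at_step(step: int, steps_per_epoch: int = STEPS_PER_EPOCH) -> float:
    """Fractional epoch reached after a given number of optimizer steps."""
    return step / steps_per_epoch


epochs_run = epoch_at_step(TOTAL_STEPS)   # total epochs covered by the run
num_evals = TOTAL_STEPS // EVAL_EVERY     # number of rows in the table

# Rough estimate only: assumes every optimizer step saw a full batch.
approx_examples_per_epoch = STEPS_PER_EPOCH * TOTAL_TRAIN_BATCH_SIZE

print(epochs_run, num_evals, approx_examples_per_epoch)
```

The epoch values in the table are rounded to two decimals, so step 64 (0.125 epochs) appears as 0.12.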
Framework versions
- Transformers 4.20.1
- Pytorch 1.12.1+cu102
- Datasets 2.0.0
- Tokenizers 0.11.0