fine_tuned_xlm-roberta_3April
This model is a fine-tuned version of ancs21/xlm-roberta-large-vi-qa on an unspecified dataset (the auto-generated card records it as `None`). At the final logged checkpoint (step 42000), it achieves the following results on the evaluation set:
- Best F1: 76.1698
- Loss: 3.3902
- Exact: 39.0997
- F1: 56.5332
- Total: 3821
- Hasans Exact: 56.1628
- Hasans F1: 81.2715
- Hasans Total: 2653
- Noans Exact: 0.3425
- Noans F1: 0.3425
- Noans Total: 1168
- Best Exact: 60.8479
- Best Exact Thresh: 0.5169
- Best F1 Thresh: 0.6480
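The overall Exact and F1 scores above are consistent with a count-weighted average of the HasAns and NoAns subsets, which is how SQuAD v2-style evaluation combines them. A minimal sketch that checks this against the reported numbers; the helper name `weighted_score` is illustrative and not part of any library:

```python
def weighted_score(has_ans_score, has_ans_total, no_ans_score, no_ans_total):
    """Combine per-subset scores into one overall score, weighted by example count."""
    total = has_ans_total + no_ans_total
    return (has_ans_score * has_ans_total + no_ans_score * no_ans_total) / total

# Values copied from the evaluation results listed above.
overall_exact = weighted_score(56.1628, 2653, 0.3425, 1168)
overall_f1 = weighted_score(81.2715, 2653, 0.3425, 1168)

print(round(overall_exact, 4), round(overall_f1, 4))
```

Both values agree with the reported Exact (39.0997) and F1 (56.5332) to within rounding. The near-zero NoAns scores are what pull the overall numbers well below the HasAns ones. The Best Exact and Best F1 figures are the scores obtained after tuning the no-answer threshold, which is why they are much higher than the unthresholded Exact and F1.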
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
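The total_train_batch_size above is the per-device batch size multiplied by the gradient accumulation steps. A small sketch of that arithmetic, assuming a single training device (the card does not state the device count, but 1 x 16 = 16 only holds for one device). The per-epoch figures are rough estimates inferred from the last row of the results table (step 42000 at epoch 9.95), not numbers reported by the authors:

```python
train_batch_size = 1
gradient_accumulation_steps = 16
num_devices = 1  # assumption: not stated in the card

# Number of examples contributing to each optimizer update.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Rough estimate: the last logged row is step 42000 at epoch 9.95
# (the epoch value is rounded to two decimals, so this is approximate).
steps_per_epoch = 42000 / 9.95
approx_train_features = steps_per_epoch * total_train_batch_size

print(total_train_batch_size, round(steps_per_epoch), round(approx_train_features))
```

Under these assumptions, one epoch is a little over 4,200 optimizer steps, which would correspond to a training set on the order of 67,000 tokenized features.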
Training results
Training Loss | Epoch | Step | Best F1 | Validation Loss | Exact | F1 | Total | Hasans Exact | Hasans F1 | Hasans Total | Noans Exact | Noans F1 | Noans Total | Best Exact | Best Exact Thresh | Best F1 Thresh |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.6591 | 0.24 | 1000 | 69.8185 | 1.2645 | 36.5611 | 54.8584 | 3821 | 52.6574 | 79.0101 | 2653 | 0.0 | 0.0 | 1168 | 55.2473 | 0.8809 | 0.9575 |
0.576 | 0.47 | 2000 | 72.9418 | 1.2144 | 37.6865 | 55.8902 | 3821 | 54.2405 | 80.4585 | 2653 | 0.0856 | 0.0856 | 1168 | 57.2363 | 0.9300 | 0.9414 |
0.5355 | 0.71 | 3000 | 72.9946 | 1.1680 | 39.0474 | 55.7491 | 3821 | 56.2382 | 80.2930 | 2653 | 0.0 | 0.0 | 1168 | 58.9375 | 0.9023 | 0.9418 |
0.558 | 0.95 | 4000 | 71.7133 | 1.2049 | 38.2361 | 55.4918 | 3821 | 54.7305 | 79.5832 | 2653 | 0.7705 | 0.7705 | 1168 | 57.2887 | 0.8935 | 0.9786 |
0.4136 | 1.18 | 5000 | 71.3382 | 1.2880 | 38.7071 | 55.5727 | 3821 | 55.6351 | 79.9258 | 2653 | 0.2568 | 0.2568 | 1168 | 57.3934 | 0.6526 | 0.8172 |
0.4046 | 1.42 | 6000 | 73.7959 | 1.1542 | 38.4978 | 55.9133 | 3821 | 55.1451 | 80.2280 | 2653 | 0.6849 | 0.6849 | 1168 | 58.8851 | 0.8676 | 0.9130 |
0.3991 | 1.66 | 7000 | 73.8805 | 1.1187 | 38.9950 | 56.2014 | 3821 | 56.1628 | 80.9445 | 2653 | 0.0 | 0.0 | 1168 | 59.5394 | 0.8720 | 0.9676 |
0.4062 | 1.9 | 8000 | 74.6655 | 1.0558 | 38.3146 | 55.9064 | 3821 | 55.1451 | 80.4818 | 2653 | 0.0856 | 0.0856 | 1168 | 59.4609 | 0.7752 | 0.8960 |
0.3246 | 2.13 | 9000 | 74.3864 | 1.3330 | 39.5708 | 56.4795 | 3821 | 56.6528 | 81.0058 | 2653 | 0.7705 | 0.7705 | 1168 | 59.9058 | 0.7984 | 0.9832 |
0.3016 | 2.37 | 10000 | 73.8462 | 1.3389 | 39.0212 | 56.2669 | 3821 | 56.2005 | 81.0388 | 2653 | 0.0 | 0.0 | 1168 | 59.1468 | 0.7780 | 0.8369 |
0.297 | 2.61 | 11000 | 75.1653 | 1.3418 | 39.6493 | 56.8063 | 3821 | 57.1052 | 81.8156 | 2653 | 0.0 | 0.0 | 1168 | 59.9320 | 0.8351 | 0.9593 |
0.291 | 2.84 | 12000 | 74.9741 | 1.3439 | 39.0997 | 56.5274 | 3821 | 56.3136 | 81.4139 | 2653 | 0.0 | 0.0 | 1168 | 59.3562 | 0.9742 | 0.9970 |
0.251 | 3.08 | 13000 | 75.3242 | 1.7083 | 39.5446 | 56.7699 | 3821 | 56.3890 | 81.1978 | 2653 | 1.2842 | 1.2842 | 1168 | 60.5339 | 0.9559 | 0.9798 |
0.2022 | 3.32 | 14000 | 75.0498 | 1.5213 | 38.7857 | 56.1680 | 3821 | 55.8613 | 80.8963 | 2653 | 0.0 | 0.0 | 1168 | 59.8273 | 0.9108 | 0.9803 |
0.2129 | 3.55 | 15000 | 75.1471 | 1.5169 | 38.7857 | 56.1669 | 3821 | 55.8236 | 80.8570 | 2653 | 0.0856 | 0.0856 | 1168 | 59.3300 | 0.9468 | 0.9995 |
0.2071 | 3.79 | 16000 | 74.6861 | 1.4170 | 39.5185 | 56.7831 | 3821 | 56.8790 | 81.7445 | 2653 | 0.0856 | 0.0856 | 1168 | 60.0105 | 0.7001 | 0.8805 |
0.2052 | 4.03 | 17000 | 75.7601 | 1.9237 | 39.4661 | 56.6037 | 3821 | 56.8036 | 81.4862 | 2653 | 0.0856 | 0.0856 | 1168 | 60.5339 | 0.5681 | 0.9653 |
0.1407 | 4.26 | 18000 | 75.5813 | 1.8430 | 38.3407 | 56.0982 | 3821 | 55.2205 | 80.7957 | 2653 | 0.0 | 0.0 | 1168 | 60.0105 | 0.5421 | 0.9222 |
0.153 | 4.5 | 19000 | 75.1837 | 1.7648 | 38.4716 | 56.0839 | 3821 | 55.3713 | 80.7376 | 2653 | 0.0856 | 0.0856 | 1168 | 60.0628 | 0.6387 | 0.9745 |
0.1538 | 4.74 | 20000 | 75.0762 | 1.5864 | 37.8697 | 55.5910 | 3821 | 54.5043 | 80.0276 | 2653 | 0.0856 | 0.0856 | 1168 | 59.4609 | 0.6394 | 0.9597 |
0.1569 | 4.97 | 21000 | 74.7077 | 1.6717 | 38.1837 | 55.8677 | 3821 | 54.9943 | 80.4638 | 2653 | 0.0 | 0.0 | 1168 | 58.8589 | 0.5886 | 0.8916 |
0.1091 | 5.21 | 22000 | 75.6475 | 2.0903 | 38.6548 | 56.2204 | 3821 | 55.6351 | 80.9341 | 2653 | 0.0856 | 0.0856 | 1168 | 60.1152 | 0.6634 | 0.8348 |
0.1107 | 5.45 | 23000 | 75.3914 | 2.2899 | 39.2829 | 56.6296 | 3821 | 56.2382 | 81.2220 | 2653 | 0.7705 | 0.7705 | 1168 | 59.6964 | 0.5812 | 0.9192 |
0.1147 | 5.69 | 24000 | 75.9043 | 1.9436 | 38.9950 | 56.4320 | 3821 | 56.1251 | 81.2388 | 2653 | 0.0856 | 0.0856 | 1168 | 60.5077 | 0.7035 | 0.9620 |
0.1097 | 5.92 | 25000 | 76.0467 | 1.8455 | 38.3669 | 55.8887 | 3821 | 55.2582 | 80.4940 | 2653 | 0.0 | 0.0 | 1168 | 60.3769 | 0.7807 | 0.9827 |
0.0844 | 6.16 | 26000 | 75.6935 | 2.5688 | 38.6810 | 56.3070 | 3821 | 55.6728 | 81.0588 | 2653 | 0.0856 | 0.0856 | 1168 | 59.8273 | 0.6209 | 0.9985 |
0.0774 | 6.4 | 27000 | 75.6526 | 2.5920 | 38.5763 | 56.1577 | 3821 | 55.5597 | 80.8815 | 2653 | 0.0 | 0.0 | 1168 | 60.0628 | 0.5364 | 0.9746 |
0.08 | 6.63 | 28000 | 75.5989 | 2.6070 | 38.9427 | 56.6394 | 3821 | 56.0498 | 81.5375 | 2653 | 0.0856 | 0.0856 | 1168 | 59.9581 | 0.5102 | 0.8782 |
0.0791 | 6.87 | 29000 | 75.9065 | 2.4939 | 38.6810 | 56.7529 | 3821 | 55.5974 | 81.6256 | 2653 | 0.2568 | 0.2568 | 1168 | 60.0890 | 0.7359 | 0.9913 |
0.0667 | 7.11 | 30000 | 75.9315 | 2.8027 | 38.6810 | 56.6293 | 3821 | 55.5974 | 81.4476 | 2653 | 0.2568 | 0.2568 | 1168 | 59.8011 | 0.6325 | 0.9909 |
0.0504 | 7.34 | 31000 | 75.4506 | 2.9797 | 39.0735 | 56.6964 | 3821 | 56.2382 | 81.6196 | 2653 | 0.0856 | 0.0856 | 1168 | 60.1413 | 0.5311 | 0.9996 |
0.0485 | 7.58 | 32000 | 75.7607 | 2.8116 | 38.7333 | 56.4275 | 3821 | 55.7105 | 81.1947 | 2653 | 0.1712 | 0.1712 | 1168 | 60.1413 | 0.5366 | 0.9183 |
0.0487 | 7.82 | 33000 | 75.9794 | 2.9559 | 39.1259 | 56.5935 | 3821 | 56.2005 | 81.3584 | 2653 | 0.3425 | 0.3425 | 1168 | 60.6386 | 0.4966 | 0.9163 |
0.0496 | 8.05 | 34000 | 75.8458 | 2.9892 | 39.2044 | 56.5961 | 3821 | 56.2382 | 81.2868 | 2653 | 0.5137 | 0.5137 | 1168 | 60.4292 | 0.4952 | 0.9726 |
0.0378 | 8.29 | 35000 | 75.5534 | 3.0456 | 39.0212 | 56.4687 | 3821 | 55.9744 | 81.1033 | 2653 | 0.5137 | 0.5137 | 1168 | 60.1152 | 0.4984 | 0.9850 |
0.0322 | 8.53 | 36000 | 75.5716 | 3.2717 | 38.9689 | 56.4812 | 3821 | 55.9367 | 81.1589 | 2653 | 0.4281 | 0.4281 | 1168 | 60.2460 | 0.5405 | 0.5620 |
0.0333 | 8.77 | 37000 | 76.1348 | 3.2260 | 39.0997 | 56.6567 | 3821 | 56.0121 | 81.2986 | 2653 | 0.6849 | 0.6849 | 1168 | 60.5601 | 0.6449 | 0.9261 |
0.0298 | 9.0 | 38000 | 76.4711 | 3.1900 | 39.3353 | 56.7593 | 3821 | 56.5021 | 81.5972 | 2653 | 0.3425 | 0.3425 | 1168 | 60.9526 | 0.5191 | 0.9665 |
0.0181 | 9.24 | 39000 | 76.4433 | 3.4204 | 39.3353 | 56.7658 | 3821 | 56.4644 | 81.5688 | 2653 | 0.4281 | 0.4281 | 1168 | 61.1882 | 0.5781 | 0.9364 |
0.0188 | 9.48 | 40000 | 75.8578 | 3.4566 | 39.2567 | 56.5759 | 3821 | 56.3890 | 81.3330 | 2653 | 0.3425 | 0.3425 | 1168 | 60.6909 | 0.4253 | 0.7260 |
0.0226 | 9.71 | 41000 | 76.2686 | 3.3418 | 39.4138 | 56.6183 | 3821 | 56.6528 | 81.4317 | 2653 | 0.2568 | 0.2568 | 1168 | 61.1882 | 0.5242 | 0.9950 |
0.019 | 9.95 | 42000 | 76.1698 | 3.3902 | 39.0997 | 56.5332 | 3821 | 56.1628 | 81.2715 | 2653 | 0.3425 | 0.3425 | 1168 | 60.8479 | 0.5169 | 0.6480 |
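Training loss falls steadily across the table while validation loss rises after roughly the second epoch, so the final checkpoint is not necessarily the one to keep. A sketch of selecting a checkpoint by metric, using an excerpt of rows copied from the table above; the tuple layout and variable names are illustrative:

```python
# (step, training_loss, validation_loss, best_f1): excerpt of the results table
rows = [
    (1000, 0.6591, 1.2645, 69.8185),
    (8000, 0.4062, 1.0558, 74.6655),
    (17000, 0.2052, 1.9237, 75.7601),
    (25000, 0.1097, 1.8455, 76.0467),
    (38000, 0.0298, 3.1900, 76.4711),
    (42000, 0.019, 3.3902, 76.1698),
]

# Checkpoint with the lowest validation loss.
lowest_val_loss = min(rows, key=lambda r: r[2])
# Checkpoint with the highest threshold-tuned F1.
highest_best_f1 = max(rows, key=lambda r: r[3])

print(lowest_val_loss[0], highest_best_f1[0])
```

The two picks from this excerpt are also the extremes of the full table: validation loss is lowest at step 8000 (1.0558) and Best F1 peaks at step 38000 (76.4711). The criteria disagree, so which checkpoint is preferable depends on whether the no-answer threshold will be tuned at inference time.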
Framework versions
- Transformers 4.37.2
- Pytorch 1.13.1+cu117
- Datasets 2.16.1
- Tokenizers 0.15.2