---
library_name: transformers
license: apache-2.0
base_model: bert-base-uncased
tags:
- generated_from_trainer
model-index:
- name: bengali_qa_model_AGGRO_bert_base_uncased
  results: []
---

# bengali_qa_model_AGGRO_bert_base_uncased

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1386
- Exact Match: 95.2857
- F1 Score: 96.3846

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows this list):
- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 3407
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- training_steps: 150
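For reproducibility, the list above corresponds roughly to the `TrainingArguments` below. This is a minimal sketch, not the original training script: the `output_dir` and everything not listed in this card (dataset, preprocessing, logging) are assumptions.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above.
# output_dir is a placeholder; it is not taken from this card.
training_args = TrainingArguments(
    output_dir="bengali_qa_model_AGGRO_bert_base_uncased",
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=3407,
    gradient_accumulation_steps=16,  # 4 x 16 = total train batch size of 64
    optim="adamw_torch",             # AdamW with default betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    max_steps=150,
)
```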
### Training results

| Training Loss | Epoch | Step | Validation Loss | Exact Match | F1 Score |
|:-------------:|:------:|:----:|:---------------:|:-----------:|:--------:|
| 6.2014 | 0.0053 | 1 | 6.1979 | 0.0 | 7.0506 |
| 6.1881 | 0.0107 | 2 | 6.1825 | 0.0 | 7.3879 |
| 6.2139 | 0.0160 | 3 | 6.1522 | 0.0 | 8.2521 |
| 6.1951 | 0.0214 | 4 | 6.1075 | 0.0 | 10.5119 |
| 6.1083 | 0.0267 | 5 | 6.0489 | 0.4511 | 15.3179 |
| 6.0835 | 0.0321 | 6 | 5.9767 | 2.1053 | 24.1412 |
| 6.0129 | 0.0374 | 7 | 5.8920 | 3.6842 | 32.7230 |
| 5.9993 | 0.0428 | 8 | 5.7951 | 6.5414 | 40.7170 |
| 5.8684 | 0.0481 | 9 | 5.6866 | 9.9248 | 46.7951 |
| 5.7231 | 0.0535 | 10 | 5.5651 | 13.8346 | 51.5365 |
| 5.6534 | 0.0588 | 11 | 5.4313 | 18.4211 | 55.1575 |
| 5.5657 | 0.0641 | 12 | 5.2856 | 23.2331 | 58.0711 |
| 5.5148 | 0.0695 | 13 | 5.1299 | 26.6165 | 60.2489 |
| 5.3065 | 0.0748 | 14 | 4.9627 | 29.7744 | 62.0035 |
| 5.1358 | 0.0802 | 15 | 4.7850 | 33.7594 | 64.1128 |
| 4.9002 | 0.0855 | 16 | 4.5951 | 36.9173 | 66.1340 |
| 4.8232 | 0.0909 | 17 | 4.4087 | 39.7744 | 67.8590 |
| 4.604 | 0.0962 | 18 | 4.2247 | 43.1579 | 69.4792 |
| 4.5291 | 0.1016 | 19 | 4.0460 | 47.1429 | 71.8867 |
| 4.3711 | 0.1069 | 20 | 3.8705 | 50.2256 | 73.7469 |
| 4.2404 | 0.1123 | 21 | 3.6984 | 52.6316 | 74.5106 |
| 4.1769 | 0.1176 | 22 | 3.5303 | 54.9624 | 75.5802 |
| 3.8464 | 0.1230 | 23 | 3.3656 | 56.4662 | 76.4567 |
| 3.8178 | 0.1283 | 24 | 3.2085 | 57.9699 | 77.2230 |
| 3.6047 | 0.1336 | 25 | 3.0549 | 60.3008 | 78.0660 |
| 3.4466 | 0.1390 | 26 | 2.9061 | 61.8797 | 78.4774 |
| 3.3154 | 0.1443 | 27 | 2.7639 | 63.8346 | 79.6681 |
| 3.3505 | 0.1497 | 28 | 2.6242 | 66.1654 | 80.4722 |
| 3.0315 | 0.1550 | 29 | 2.4883 | 67.3684 | 81.0899 |
| 2.8796 | 0.1604 | 30 | 2.3571 | 68.9474 | 82.0043 |
| 2.9183 | 0.1657 | 31 | 2.2325 | 70.8271 | 82.8722 |
| 2.4212 | 0.1711 | 32 | 2.1128 | 71.7293 | 83.2822 |
| 2.283 | 0.1764 | 33 | 1.9975 | 72.2556 | 83.7327 |
| 2.2454 | 0.1818 | 34 | 1.8851 | 73.0075 | 84.1584 |
| 2.1467 | 0.1871 | 35 | 1.7746 | 74.2105 | 84.9934 |
| 2.1079 | 0.1924 | 36 | 1.6643 | 76.0150 | 86.0725 |
| 1.7214 | 0.1978 | 37 | 1.5538 | 77.6692 | 87.2392 |
| 2.0057 | 0.2031 | 38 | 1.4412 | 78.1203 | 87.5816 |
| 1.8565 | 0.2085 | 39 | 1.3280 | 79.5489 | 88.3833 |
| 1.8383 | 0.2138 | 40 | 1.2168 | 80.4511 | 89.0872 |
| 1.6629 | 0.2192 | 41 | 1.1118 | 81.0526 | 89.6248 |
| 1.3915 | 0.2245 | 42 | 1.0174 | 81.6541 | 89.5453 |
| 1.3512 | 0.2299 | 43 | 0.9330 | 82.3308 | 89.6999 |
| 1.1958 | 0.2352 | 44 | 0.8608 | 82.4812 | 89.8430 |
| 1.0808 | 0.2406 | 45 | 0.8000 | 82.7820 | 89.7950 |
| 1.0992 | 0.2459 | 46 | 0.7469 | 83.0075 | 89.8987 |
| 0.8846 | 0.2513 | 47 | 0.6994 | 83.3083 | 90.0405 |
| 0.9159 | 0.2566 | 48 | 0.6537 | 84.5113 | 90.4492 |
| 0.7867 | 0.2619 | 49 | 0.6108 | 85.1128 | 90.7277 |
| 0.849 | 0.2673 | 50 | 0.5724 | 86.2406 | 90.9660 |
| 0.7173 | 0.2726 | 51 | 0.5379 | 86.7669 | 91.2355 |
| 0.8123 | 0.2780 | 52 | 0.5058 | 87.3684 | 91.5345 |
| 0.6065 | 0.2833 | 53 | 0.4770 | 87.3684 | 91.4719 |
| 0.5135 | 0.2887 | 54 | 0.4495 | 87.6692 | 91.4269 |
| 0.809 | 0.2940 | 55 | 0.4236 | 88.3459 | 91.6852 |
| 0.5281 | 0.2994 | 56 | 0.3990 | 88.6466 | 91.8734 |
| 0.5029 | 0.3047 | 57 | 0.3780 | 89.2481 | 92.0989 |
| 0.5069 | 0.3101 | 58 | 0.3593 | 89.6241 | 92.2735 |
| 0.4163 | 0.3154 | 59 | 0.3425 | 90.1504 | 92.6192 |
| 0.4271 | 0.3207 | 60 | 0.3275 | 90.8271 | 92.8536 |
| 0.385 | 0.3261 | 61 | 0.3087 | 91.3534 | 93.0501 |
| 0.365 | 0.3314 | 62 | 0.2891 | 91.5038 | 93.1793 |
| 0.3785 | 0.3368 | 63 | 0.2729 | 91.8797 | 93.3364 |
| 0.1751 | 0.3421 | 64 | 0.2598 | 92.2556 | 93.5858 |
| 0.3697 | 0.3475 | 65 | 0.2499 | 92.7068 | 93.9267 |
| 0.4384 | 0.3528 | 66 | 0.2413 | 92.9323 | 94.1523 |
| 0.2969 | 0.3582 | 67 | 0.2327 | 93.0075 | 94.2276 |
| 0.3493 | 0.3635 | 68 | 0.2251 | 93.3083 | 94.4743 |
| 0.1457 | 0.3689 | 69 | 0.2182 | 93.4586 | 94.5879 |
| 0.6827 | 0.3742 | 70 | 0.2096 | 93.6842 | 94.7584 |
| 0.4472 | 0.3796 | 71 | 0.2018 | 93.7594 | 94.8085 |
| 0.1831 | 0.3849 | 72 | 0.1945 | 93.6842 | 94.6772 |
| 0.2381 | 0.3902 | 73 | 0.1891 | 93.7594 | 94.6731 |
| 0.1892 | 0.3956 | 74 | 0.1830 | 93.8346 | 94.7656 |

### Framework versions

- Transformers 4.46.3
- Pytorch 2.4.0
- Datasets 3.1.0
- Tokenizers 0.20.3
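For a quick check that a checkpoint produced with the versions above loads and answers extractive QA queries, a minimal inference sketch is shown below. The model id is a placeholder for the actual Hub repository or local path, and the Bengali question/context pair is illustrative only, not taken from the training data.

```python
from transformers import pipeline

# Placeholder model id: replace with the actual repo id or checkpoint directory.
qa = pipeline(
    "question-answering",
    model="path/to/bengali_qa_model_AGGRO_bert_base_uncased",
)

# Illustrative example:
# Question: "What is the capital of Bangladesh?"
# Context:  "The capital of Bangladesh is Dhaka."
result = qa(
    question="বাংলাদেশের রাজধানী কী?",
    context="বাংলাদেশের রাজধানী ঢাকা।",
)
print(result)  # {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```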