bengali_qa_model_AGGRO_banglabert

This model is a fine-tuned version of csebuetnlp/banglabert on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2676
  • Exact Match: 98.5714
  • F1 Score: 99.0056
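
The card does not say how Exact Match and F1 were computed. For extractive question answering these are commonly the SQuAD-style metrics, reported as percentages. A minimal sketch under that assumption follows. The function names are illustrative, and the whitespace-only normalization is a simplification (the SQuAD script also strips English articles and punctuation, which may not suit Bengali text).

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    # Assumption: lowercase and split on whitespace only.
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> float:
    # 1.0 when the normalized token sequences are identical, else 0.0.
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    # Count tokens shared by prediction and reference (with multiplicity).
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def corpus_scores(pairs):
    # Average the per-example scores and scale them to percentages.
    n = len(pairs)
    em = 100.0 * sum(exact_match(p, r) for p, r in pairs) / n
    f1 = 100.0 * sum(f1_score(p, r) for p, r in pairs) / n
    return em, f1
```

Under this definition F1 gives partial credit for overlapping answer spans, which is why it sits above Exact Match in the results.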

Model description

More information needed

Intended uses & limitations

More information needed
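
Pending details from the author, the following is a minimal usage sketch. It assumes the checkpoint is published under the repository ID Mediocre-Judge/bengali_qa_model_AGGRO_banglabert and was saved with an extractive question-answering head, which the Exact Match and F1 metrics suggest but the card does not confirm. The helper name `answer_question` and the example strings are illustrative.

```python
# Repository ID as listed on the model page.
MODEL_ID = "Mediocre-Judge/bengali_qa_model_AGGRO_banglabert"


def answer_question(question: str, context: str) -> dict:
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    # Assumption: the checkpoint loads as a question-answering model.
    qa = pipeline("question-answering", model=MODEL_ID, tokenizer=MODEL_ID)
    # The pipeline returns a dict with "answer", "score", "start" and "end".
    return qa(question=question, context=context)


# Example call (downloads the weights on first use):
# answer_question("বাংলাদেশের রাজধানী কোথায়?", "বাংলাদেশের রাজধানী ঢাকা।")
```

Building the pipeline inside the function keeps the sketch short. For repeated queries it would be cheaper to create the pipeline once and reuse it.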

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 3407
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 100
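
As a worked example of how these values fit together, the sketch below collects them as keyword arguments using `transformers.TrainingArguments` field names and reproduces the derived quantities. It assumes a single device (so 4 × 16 = 64 matches the reported total batch size), a warmup of ceil(0.1 × 100) = 10 steps, and a half-cosine decay to zero. `output_dir` and the helper names are placeholders, not values from the original run.

```python
import math

# Hyperparameters from the list above, using TrainingArguments field names.
training_kwargs = {
    "output_dir": "bengali_qa_model",  # placeholder, not from the card
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 3407,
    "gradient_accumulation_steps": 16,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "max_steps": 100,
}


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    # Examples consumed per optimizer step.
    return (kwargs["per_device_train_batch_size"]
            * kwargs["gradient_accumulation_steps"]
            * num_devices)


def learning_rate_at(step: int, kwargs: dict) -> float:
    # Linear warmup followed by half-cosine decay to zero (assumed shape).
    total = kwargs["max_steps"]
    warmup = math.ceil(total * kwargs["warmup_ratio"])
    peak = kwargs["learning_rate"]
    if step < warmup:
        return peak * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With `transformers` installed, `TrainingArguments(**training_kwargs)` should accept these names. The helpers themselves need only the standard library.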

Training results

Training Loss  Epoch   Step  Validation Loss  Exact Match  F1 Score
6.0126         0.0053  1     5.9783           0.0          0.6103
6.0125         0.0107  2     5.9540           0.0          0.7848
5.9675         0.0160  3     5.9074           0.0          0.9597
5.9287         0.0214  4     5.8425           0.0          1.7507
5.8586         0.0267  5     5.7636           0.1504       4.2535
5.8206         0.0321  6     5.6740           0.4511       11.2628
5.7246         0.0374  7     5.5749           1.7293       23.3816
5.634          0.0428  8     5.4574           3.9850       37.7873
5.4963         0.0481  9     5.3105           5.7895       47.4987
5.2985         0.0535  10    5.1265           7.5940       52.0471
5.182          0.0588  11    4.8997           11.9549      54.5555
4.973          0.0641  12    4.6631           15.9398      56.5530
4.8353         0.0695  13    4.4348           19.6241      58.6313
4.6269         0.0748  14    4.2322           23.5338      60.7029
4.4238         0.0802  15    4.0467           28.0451      62.7494
4.1976         0.0855  16    3.8781           32.7068      64.6375
4.1302         0.0909  17    3.7200           35.7895      66.2513
3.9139         0.0962  18    3.5621           39.5489      67.7758
3.8521         0.1016  19    3.4019           43.0075      69.2899
3.7003         0.1069  20    3.2534           46.0150      70.6373
3.5972         0.1123  21    3.1168           48.8722      72.3043
3.5249         0.1176  22    2.9875           51.5038      73.2903
3.1756         0.1230  23    2.8600           53.6090      74.1609
3.2323         0.1283  24    2.7356           55.2632      74.8864
3.0696         0.1336  25    2.6150           56.8421      75.8938
2.9806         0.1390  26    2.5029           58.9474      77.3831
2.8261         0.1443  27    2.3997           61.1278      78.8467
2.8965         0.1497  28    2.3045           63.9098      80.8890
2.6622         0.1550  29    2.2151           66.0902      82.4263
2.5132         0.1604  30    2.1300           68.0451      83.8984
2.5076         0.1657  31    2.0482           70.9774      85.4846
2.2189         0.1711  32    1.9678           72.7068      86.2628
2.0851         0.1764  33    1.8883           75.8647      87.8992
2.1198         0.1818  34    1.8091           78.5714      89.4148
2.0272         0.1871  35    1.7300           80.8271      90.3877
1.9951         0.1924  36    1.6514           82.7068      91.2138
1.7741         0.1978  37    1.5736           84.9624      91.8920
1.9176         0.2031  38    1.4970           86.5414      92.3250
1.8599         0.2085  39    1.4219           87.5940      92.7578
1.8095         0.2138  40    1.3496           88.5714      93.0980
1.7814         0.2192  41    1.2790           90.0752      93.7737
1.4602         0.2245  42    1.2103           91.5038      94.6447
1.5147         0.2299  43    1.1431           92.1805      95.1039
1.4205         0.2352  44    1.0774           92.9323      95.4111
1.3222         0.2406  45    1.0127           93.9850      96.0199
1.2477         0.2459  46    0.9508           94.8120      96.5219
1.1406         0.2513  47    0.8936           95.2632      96.8391
1.1698         0.2566  48    0.8382           96.3158      97.5331
1.1359         0.2619  49    0.7847           97.0677      97.9841
1.1811         0.2673  50    0.7324           97.5940      98.4006
0.9734         0.2726  51    0.6814           97.7444      98.5321
0.928          0.2780  52    0.6318           97.8947      98.6140
0.8989         0.2833  53    0.5859           98.1203      98.7571
0.7784         0.2887  54    0.5430           98.3459      98.9243
1.0015         0.2940  55    0.5027           98.3459      98.8914
0.7509         0.2994  56    0.4656           98.5714      99.0811
0.6838         0.3047  57    0.4328           98.7970      99.1723
0.7336         0.3101  58    0.4042           98.8722      99.1327
0.5729         0.3154  59    0.3781           98.9474      99.2079
0.5891         0.3207  60    0.3538           99.0226      99.3362
0.6168         0.3261  61    0.3322           99.1729      99.4169
0.5503         0.3314  62    0.3130           99.1729      99.4169
0.5058         0.3368  63    0.2955           99.1729      99.4169
0.4065         0.3421  64    0.2788           99.3233      99.5000
0.4466         0.3475  65    0.2638           99.2481      99.4981
0.4727         0.3528  66    0.2496           99.2481      99.4981
0.45           0.3582  67    0.2365           99.2481      99.4981
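
The Epoch column allows a rough estimate of the training-set size, which the card does not state. The sketch below divides the last logged step by its epoch fraction and multiplies by the total train batch size of 64. It assumes the epoch value is simply optimizer steps divided by steps per epoch, so the result is approximate.

```python
def estimate_dataset_size(step: int, epoch: float, total_batch_size: int):
    # Optimizer steps needed to complete one full epoch.
    steps_per_epoch = round(step / epoch)
    # Approximate number of training examples seen in one epoch.
    examples = steps_per_epoch * total_batch_size
    return steps_per_epoch, examples


# Last logged row: step 67 at epoch 0.3582, with total_train_batch_size 64.
steps_per_epoch, examples = estimate_dataset_size(67, 0.3582, 64)
print(steps_per_epoch, examples)  # 187 11968
```

That is roughly 12,000 training examples, and it implies the 67 logged steps cover only about a third of one epoch.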

Framework versions

  • Transformers 4.46.3
  • Pytorch 2.4.0
  • Datasets 3.1.0
  • Tokenizers 0.20.3

Model size

  • Parameters: 110M
  • Tensor type: F32
  • Format: Safetensors

Model tree for Mediocre-Judge/bengali_qa_model_AGGRO_banglabert

Fine-tuned from csebuetnlp/banglabert.