Mediocre-Judge/bengali_qa_model_AGGRO_V2

+---
+library_name: transformers
+license: mit
+base_model: sagorsarker/bangla-bert-base
+tags:
+- generated_from_trainer
+model-index:
+- name: bengali_qa_model_AGGRO_V2
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# bengali_qa_model_AGGRO_V2
+This model is a fine-tuned version of [sagorsarker/bangla-bert-base](https://huggingface.co/sagorsarker/bangla-bert-base) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.1885
+- Exact Match: 96.0
+- F1 Score: 96.3051
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 1e-05
+- train_batch_size: 4
+- eval_batch_size: 4
+- seed: 3407
+- gradient_accumulation_steps: 16
+- total_train_batch_size: 64
+- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- training_steps: 50
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Exact Match | F1 Score |
+|:-------------:|:------:|:----:|:---------------:|:-----------:|:--------:|
+| 6.0684        | 0.0053 | 1    | 6.0629          | 0.0         | 6.2927   |
+| 6.033         | 0.0107 | 2    | 5.9761          | 0.0         | 6.7935   |
+| 6.0144        | 0.0160 | 3    | 5.8037          | 0.0         | 9.7900   |
+| 5.8029        | 0.0214 | 4    | 5.5486          | 0.5263      | 19.3074  |
+| 5.6831        | 0.0267 | 5    | 5.2126          | 2.2556      | 37.8180  |
+| 5.26          | 0.0321 | 6    | 4.7970          | 5.9398      | 49.8764  |
+| 4.8899        | 0.0374 | 7    | 4.3855          | 9.3233      | 55.2670  |
+| 4.5683        | 0.0428 | 8    | 3.9798          | 15.3383     | 59.7750  |
+| 4.0571        | 0.0481 | 9    | 3.5837          | 22.6316     | 63.9729  |
+| 3.6658        | 0.0535 | 10   | 3.2052          | 28.7218     | 66.1381  |
+| 3.3842        | 0.0588 | 11   | 2.8517          | 33.6842     | 68.2625  |
+| 3.0377        | 0.0641 | 12   | 2.5296          | 38.3459     | 69.6544  |
+| 2.933         | 0.0695 | 13   | 2.2425          | 42.1053     | 70.3538  |
+| 2.383         | 0.0748 | 14   | 1.9875          | 45.9398     | 71.7662  |
+| 2.12          | 0.0802 | 15   | 1.7636          | 50.1504     | 73.3768  |
+| 1.7072        | 0.0855 | 16   | 1.5667          | 55.4887     | 75.4763  |
+| 1.7314        | 0.0909 | 17   | 1.3929          | 59.8496     | 77.6552  |
+| 1.4855        | 0.0962 | 18   | 1.2390          | 64.0602     | 80.1659  |
+| 1.4605        | 0.1016 | 19   | 1.1030          | 68.2707     | 82.0848  |
+| 1.4278        | 0.1069 | 20   | 0.9825          | 72.4060     | 84.1071  |
+| 1.1391        | 0.1123 | 21   | 0.8741          | 76.1654     | 85.9345  |
+| 1.2315        | 0.1176 | 22   | 0.7780          | 79.0977     | 87.2864  |
+| 0.9215        | 0.1230 | 23   | 0.6933          | 81.6541     | 88.3887  |
+| 0.7547        | 0.1283 | 24   | 0.6182          | 83.5338     | 89.2823  |
+| 0.717         | 0.1336 | 25   | 0.5517          | 86.2406     | 90.9047  |
+| 1.0054        | 0.1390 | 26   | 0.4950          | 88.1203     | 91.8787  |
+| 0.5741        | 0.1443 | 27   | 0.4465          | 89.3233     | 92.5173  |
+| 0.6248        | 0.1497 | 28   | 0.4053          | 90.3008     | 92.8381  |
+| 0.4378        | 0.1550 | 29   | 0.3709          | 91.2782     | 93.3403  |
+| 0.3546        | 0.1604 | 30   | 0.3421          | 92.2556     | 93.8510  |
+| 0.542         | 0.1657 | 31   | 0.3188          | 92.8571     | 94.1842  |
+| 0.2279        | 0.1711 | 32   | 0.2997          | 93.4586     | 94.3692  |
+| 0.1765        | 0.1764 | 33   | 0.2843          | 93.8346     | 94.5317  |
+| 0.256         | 0.1818 | 34   | 0.2721          | 94.2105     | 94.8025  |
+| 0.2041        | 0.1871 | 35   | 0.2623          | 94.2857     | 94.8878  |
+| 0.292         | 0.1924 | 36   | 0.2545          | 94.4361     | 94.9898  |
+| 0.2241        | 0.1978 | 37   | 0.2484          | 94.7368     | 95.2014  |
+| 0.5822        | 0.2031 | 38   | 0.2434          | 94.8120     | 95.3437  |
+| 0.3077        | 0.2085 | 39   | 0.2394          | 94.9624     | 95.4440  |
+| 0.3954        | 0.2138 | 40   | 0.2362          | 94.9624     | 95.4440  |
+| 0.3814        | 0.2192 | 41   | 0.2337          | 94.9624     | 95.4440  |
+| 0.1033        | 0.2245 | 42   | 0.2317          | 94.9624     | 95.4440  |
+### Framework versions
+- Transformers 4.46.3
+- Pytorch 2.4.0
+- Datasets 3.1.0
+- Tokenizers 0.20.3
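
The hyperparameters in the card imply two derived quantities: the effective batch size (4 × 16 = 64, matching `total_train_batch_size`) and the shape of the learning-rate curve over the 50 training steps. The sketch below reproduces that arithmetic in plain Python. It assumes a single device and the usual linear-warmup-then-cosine-decay shape for the `cosine` scheduler type. The helper names `effective_batch_size` and `cosine_lr` are illustrative and do not come from the card or from any library.

```python
import math

# Values copied from the hyperparameter list in the model card.
LEARNING_RATE = 1e-5
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 16
WARMUP_RATIO = 0.1
TRAINING_STEPS = 50
NUM_DEVICES = 1  # assumption: the card does not state the device count


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Number of examples contributing to one optimizer step."""
    return per_device * accum_steps * num_devices


def cosine_lr(step, base_lr, total_steps, warmup_steps):
    """Linear warmup to base_lr, then a half-cosine decay towards zero.

    Assumed schedule shape; hypothetical helper, not a library function.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Warmup length derived from the ratio, rounded up to a whole step.
warmup_steps = math.ceil(TRAINING_STEPS * WARMUP_RATIO)

# Learning rate at every optimizer step from 0 to TRAINING_STEPS inclusive.
schedule = [
    cosine_lr(s, LEARNING_RATE, TRAINING_STEPS, warmup_steps)
    for s in range(TRAINING_STEPS + 1)
]

if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    print("warmup steps:", warmup_steps)
    for s in (0, warmup_steps, 25, 42, TRAINING_STEPS):
        print(f"step {s:2d}: lr = {schedule[s]:.3e}")
```

Under these assumptions the rate peaks at the end of warmup and is still above zero at step 42, the last step logged in the results table.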

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1192ac60fba53f96014edfa979163312e9e0cddc9a8db58a1bcbd48cc43ccbfd
+oid sha256:b2e4eabbbfeb4b34e181f275aaee436bbda92033cc9305335949a7b27e912e9d
 size 655253320
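
The hunk above changes only the `oid` of the Git LFS pointer: the weights were replaced by a file of identical size but different content. In a Git LFS pointer the `oid` is the SHA-256 digest of the object's bytes and `size` is its length in bytes, so a downloaded copy can be checked against the pointer. The following is a minimal sketch of such a check. The helper names `parse_lfs_pointer` and `file_matches_pointer` are hypothetical, and the local path in the usage note is an assumption.

```python
import hashlib
import os

# New pointer contents, copied from the diff above.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:b2e4eabbbfeb4b34e181f275aaee436bbda92033cc9305335949a7b27e912e9d
size 655253320
"""


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file's size and digest both match the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so a ~655 MB file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(NEW_POINTER)
```

With a local copy of the weights, `file_matches_pointer("model.safetensors", pointer)` would report whether that file is the version introduced by this commit.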