
qa-persian-mdeberta-v3-base-squad2

This model is a fine-tuned version of makhataei/qa-persian-mdeberta-v3-base-squad2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 5.1273

Model description

More information needed

Intended uses & limitations

More information needed
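Although the card does not document intended uses, the checkpoint is published as an extractive question-answering model, so a minimal inference sketch would look like the following. This assumes the `transformers` library with network access to the Hub; the Persian question/context strings are illustrative placeholders, and given that the reported validation loss never improved during training, answers from this checkpoint may be unreliable.

```python
# Minimal QA inference sketch (assumption: standard `transformers`
# question-answering pipeline; strings below are illustrative only).
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="makhataei/qa-persian-mdeberta-v3-base-squad2",
)

result = qa(
    question="پایتخت ایران کجاست؟",      # "What is the capital of Iran?"
    context="تهران پایتخت ایران است.",   # "Tehran is the capital of Iran."
)
print(result["answer"], result["score"])
```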

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-07
  • train_batch_size: 14
  • eval_batch_size: 14
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
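The hyperparameters above can be expressed as a Hugging Face `TrainingArguments` configuration. This is a sketch reconstructed from the list, not the author's actual training script; the output directory is a placeholder, and the dataset wiring is omitted because the card does not identify the training data.

```python
# Config sketch matching the listed hyperparameters (assumption: the
# model was trained with the Hugging Face Trainer API).
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="qa-persian-mdeberta-v3-base-squad2",  # placeholder path
    learning_rate=1e-07,
    per_device_train_batch_size=14,
    per_device_eval_batch_size=14,
    seed=42,
    num_train_epochs=100,
    lr_scheduler_type="linear",
    # The card lists Adam with betas=(0.9, 0.999) and epsilon=1e-08:
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
)
```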

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 4.8591 | 1.0 | 17 | 5.1273 |
| 4.9328 | 2.0 | 34 | 5.1273 |
| 4.9113 | 3.0 | 51 | 5.1273 |
| 4.8829 | 4.0 | 68 | 5.1273 |
| 4.9652 | 5.0 | 85 | 5.1273 |
| 4.8964 | 6.0 | 102 | 5.1273 |
| 4.8316 | 7.0 | 119 | 5.1273 |
| 4.8343 | 8.0 | 136 | 5.1273 |
| 4.8819 | 9.0 | 153 | 5.1273 |
| 4.8986 | 10.0 | 170 | 5.1273 |
| 4.9171 | 11.0 | 187 | 5.1273 |
| 4.9171 | 12.0 | 204 | 5.1273 |
| 4.8551 | 13.0 | 221 | 5.1273 |
| 4.8842 | 14.0 | 238 | 5.1273 |
| 4.8491 | 15.0 | 255 | 5.1273 |
| 4.8761 | 16.0 | 272 | 5.1273 |
| 4.8561 | 17.0 | 289 | 5.1273 |
| 4.8983 | 18.0 | 306 | 5.1273 |
| 4.844 | 19.0 | 323 | 5.1273 |
| 4.8985 | 20.0 | 340 | 5.1273 |
| 4.8629 | 21.0 | 357 | 5.1273 |
| 4.9092 | 22.0 | 374 | 5.1273 |
| 4.8956 | 23.0 | 391 | 5.1273 |
| 4.8141 | 24.0 | 408 | 5.1273 |
| 4.9507 | 25.0 | 425 | 5.1273 |
| 4.9157 | 26.0 | 442 | 5.1273 |
| 4.8573 | 27.0 | 459 | 5.1273 |
| 4.8307 | 28.0 | 476 | 5.1273 |
| 4.8523 | 29.0 | 493 | 5.1273 |
| 4.8635 | 30.0 | 510 | 5.1273 |
| 4.9477 | 31.0 | 527 | 5.1273 |
| 4.9748 | 32.0 | 544 | 5.1273 |
| 4.919 | 33.0 | 561 | 5.1273 |
| 4.9646 | 34.0 | 578 | 5.1273 |
| 4.906 | 35.0 | 595 | 5.1273 |
| 4.8959 | 36.0 | 612 | 5.1273 |
| 4.9289 | 37.0 | 629 | 5.1273 |
| 4.9716 | 38.0 | 646 | 5.1273 |
| 4.9264 | 39.0 | 663 | 5.1273 |
| 4.9511 | 40.0 | 680 | 5.1273 |
| 4.9559 | 41.0 | 697 | 5.1273 |
| 4.8925 | 42.0 | 714 | 5.1273 |
| 5.0009 | 43.0 | 731 | 5.1273 |
| 5.0171 | 44.0 | 748 | 5.1273 |
| 4.9922 | 45.0 | 765 | 5.1273 |
| 4.9442 | 46.0 | 782 | 5.1273 |
| 4.9718 | 47.0 | 799 | 5.1273 |
| 4.9723 | 48.0 | 816 | 5.1273 |
| 4.9641 | 49.0 | 833 | 5.1273 |
| 4.9623 | 50.0 | 850 | 5.1273 |
| 4.9661 | 51.0 | 867 | 5.1273 |
| 4.9751 | 52.0 | 884 | 5.1273 |
| 5.0271 | 53.0 | 901 | 5.1273 |
| 5.0186 | 54.0 | 918 | 5.1273 |
| 4.9842 | 55.0 | 935 | 5.1273 |
| 4.9846 | 56.0 | 952 | 5.1273 |
| 5.0147 | 57.0 | 969 | 5.1273 |
| 5.0593 | 58.0 | 986 | 5.1273 |
| 4.9787 | 59.0 | 1003 | 5.1273 |
| 5.0569 | 60.0 | 1020 | 5.1273 |
| 5.0237 | 61.0 | 1037 | 5.1273 |
| 4.9641 | 62.0 | 1054 | 5.1273 |
| 5.048 | 63.0 | 1071 | 5.1273 |
| 4.9656 | 64.0 | 1088 | 5.1273 |
| 5.0632 | 65.0 | 1105 | 5.1273 |
| 5.0678 | 66.0 | 1122 | 5.1273 |
| 5.0426 | 67.0 | 1139 | 5.1273 |
| 5.0061 | 68.0 | 1156 | 5.1273 |
| 4.9887 | 69.0 | 1173 | 5.1273 |
| 5.0531 | 70.0 | 1190 | 5.1273 |
| 5.1073 | 71.0 | 1207 | 5.1273 |
| 5.016 | 72.0 | 1224 | 5.1273 |
| 5.0383 | 73.0 | 1241 | 5.1273 |
| 5.032 | 74.0 | 1258 | 5.1273 |
| 5.0459 | 75.0 | 1275 | 5.1273 |
| 5.0734 | 76.0 | 1292 | 5.1273 |
| 5.0059 | 77.0 | 1309 | 5.1273 |
| 5.027 | 78.0 | 1326 | 5.1273 |
| 5.0383 | 79.0 | 1343 | 5.1273 |
| 5.104 | 80.0 | 1360 | 5.1273 |
| 5.0209 | 81.0 | 1377 | 5.1273 |
| 5.0443 | 82.0 | 1394 | 5.1273 |
| 4.9923 | 83.0 | 1411 | 5.1273 |
| 5.0462 | 84.0 | 1428 | 5.1273 |
| 5.0416 | 85.0 | 1445 | 5.1273 |
| 5.0593 | 86.0 | 1462 | 5.1273 |
| 5.1125 | 87.0 | 1479 | 5.1273 |
| 5.0125 | 88.0 | 1496 | 5.1273 |
| 5.0925 | 89.0 | 1513 | 5.1273 |
| 5.0681 | 90.0 | 1530 | 5.1273 |
| 5.0962 | 91.0 | 1547 | 5.1273 |
| 5.0843 | 92.0 | 1564 | 5.1273 |
| 5.0987 | 93.0 | 1581 | 5.1273 |
| 5.0251 | 94.0 | 1598 | 5.1273 |
| 5.0904 | 95.0 | 1615 | 5.1273 |
| 5.1356 | 96.0 | 1632 | 5.1273 |
| 5.1103 | 97.0 | 1649 | 5.1273 |
| 5.0244 | 98.0 | 1666 | 5.1273 |
| 5.0914 | 99.0 | 1683 | 5.1273 |
| 5.1083 | 100.0 | 1700 | 5.1273 |

Framework versions

  • Transformers 4.35.2
  • PyTorch 2.0.1+cu117
  • Datasets 2.15.0
  • Tokenizers 0.15.0
Model details

  • Format: Safetensors
  • Model size: 278M params
  • Tensor type: F32
