---
library_name: transformers
license: mit
base_model: openai-community/gpt2
tags:
  - generated_from_trainer
model-index:
  - name: arabic-nano-gpt
    results: []
---

# arabic-nano-gpt

This model is a fine-tuned version of [openai-community/gpt2](https://huggingface.co/openai-community/gpt2) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 3.2854
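
For reference, an evaluation loss of 3.2854 corresponds to a perplexity of roughly exp(3.2854) ≈ 26.7. Below is a minimal generation sketch; the repo id is inferred from this card's location and may need adjusting, and the prompt is only an illustrative example.

```python
# Minimal generation sketch. The repo id below is inferred from this card
# (user e-hossam96, repo arabic-nano-gpt-v0) and may need adjusting.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "e-hossam96/arabic-nano-gpt-v0"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "القدس مدينة تقع في"  # illustrative Arabic prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```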

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch reproducing them follows the list):

- learning_rate: 0.001
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 24
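
As a sketch only, the `TrainingArguments` below mirror the list above. The `output_dir`, the 1000-step evaluation interval (inferred from the results table), and the omission of the dataset and `Trainer` setup are assumptions, not taken from the original training script.

```python
# Sketch of TrainingArguments matching the hyperparameters listed above.
# Dataset loading, tokenization, and the Trainer call are omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="arabic-nano-gpt",     # assumed output directory
    learning_rate=1e-3,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=4,    # effective batch size: 64 * 4 = 256
    num_train_epochs=24,
    lr_scheduler_type="linear",
    warmup_ratio=0.01,
    seed=42,
    eval_strategy="steps",            # evaluation appears to run every 1000 steps
    eval_steps=1000,
    logging_steps=1000,
    # optimizer left at the Trainer default (AdamW), which matches the Adam
    # settings listed above: betas=(0.9, 0.999), eps=1e-08
)
```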

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 5.62 | 0.0585 | 1000 | 5.3754 |
| 4.6527 | 0.1170 | 2000 | 4.4918 |
| 4.2818 | 0.1755 | 3000 | 4.1137 |
| 4.1289 | 0.2340 | 4000 | 3.9388 |
| 4.0021 | 0.2924 | 5000 | 3.8274 |
| 3.9301 | 0.3509 | 6000 | 3.7534 |
| 3.8822 | 0.4094 | 7000 | 3.6986 |
| 3.8375 | 0.4679 | 8000 | 3.6557 |
| 3.7918 | 0.5264 | 9000 | 3.6266 |
| 3.7723 | 0.5849 | 10000 | 3.5994 |
| 3.7549 | 0.6434 | 11000 | 3.5787 |
| 3.7324 | 0.7019 | 12000 | 3.5612 |
| 3.7249 | 0.7604 | 13000 | 3.5436 |
| 3.6989 | 0.8188 | 14000 | 3.5323 |
| 3.7003 | 0.8773 | 15000 | 3.5169 |
| 3.6919 | 0.9358 | 16000 | 3.5055 |
| 3.6717 | 0.9943 | 17000 | 3.4966 |
| 3.6612 | 1.0528 | 18000 | 3.4868 |
| 3.6467 | 1.1113 | 19000 | 3.4787 |
| 3.6497 | 1.1698 | 20000 | 3.4707 |
| 3.6193 | 1.2283 | 21000 | 3.4639 |
| 3.6302 | 1.2868 | 22000 | 3.4572 |
| 3.6225 | 1.3452 | 23000 | 3.4516 |
| 3.635 | 1.4037 | 24000 | 3.4458 |
| 3.6115 | 1.4622 | 25000 | 3.4416 |
| 3.6162 | 1.5207 | 26000 | 3.4348 |
| 3.6142 | 1.5792 | 27000 | 3.4329 |
| 3.5956 | 1.6377 | 28000 | 3.4293 |
| 3.5885 | 1.6962 | 29000 | 3.4226 |
| 3.603 | 1.7547 | 30000 | 3.4195 |
| 3.5947 | 1.8132 | 31000 | 3.4142 |
| 3.588 | 1.8716 | 32000 | 3.4113 |
| 3.5803 | 1.9301 | 33000 | 3.4065 |
| 3.5891 | 1.9886 | 34000 | 3.4044 |
| 3.5801 | 2.0471 | 35000 | 3.4032 |
| 3.5739 | 2.1056 | 36000 | 3.3988 |
| 3.5661 | 2.1641 | 37000 | 3.3981 |
| 3.5657 | 2.2226 | 38000 | 3.3934 |
| 3.5727 | 2.2811 | 39000 | 3.3907 |
| 3.5617 | 2.3396 | 40000 | 3.3885 |
| 3.5579 | 2.3980 | 41000 | 3.3855 |
| 3.5553 | 2.4565 | 42000 | 3.3816 |
| 3.5647 | 2.5150 | 43000 | 3.3803 |
| 3.5531 | 2.5735 | 44000 | 3.3799 |
| 3.5494 | 2.6320 | 45000 | 3.3777 |
| 3.5525 | 2.6905 | 46000 | 3.3759 |
| 3.5487 | 2.7490 | 47000 | 3.3725 |
| 3.5551 | 2.8075 | 48000 | 3.3711 |
| 3.5511 | 2.8660 | 49000 | 3.3681 |
| 3.5463 | 2.9244 | 50000 | 3.3695 |
| 3.5419 | 2.9829 | 51000 | 3.3660 |
| 3.5414 | 3.0414 | 52000 | 3.3648 |
| 3.5388 | 3.0999 | 53000 | 3.3605 |
| 3.5333 | 3.1584 | 54000 | 3.3619 |
| 3.525 | 3.2169 | 55000 | 3.3588 |
| 3.5361 | 3.2754 | 56000 | 3.3572 |
| 3.5302 | 3.3339 | 57000 | 3.3540 |
| 3.5355 | 3.3924 | 58000 | 3.3553 |
| 3.5391 | 3.4508 | 59000 | 3.3504 |
| 3.531 | 3.5093 | 60000 | 3.3495 |
| 3.5293 | 3.5678 | 61000 | 3.3483 |
| 3.5269 | 3.6263 | 62000 | 3.3489 |
| 3.5181 | 3.6848 | 63000 | 3.3494 |
| 3.5205 | 3.7433 | 64000 | 3.3480 |
| 3.5237 | 3.8018 | 65000 | 3.3440 |
| 3.5316 | 3.8603 | 66000 | 3.3417 |
| 3.5222 | 3.9188 | 67000 | 3.3433 |
| 3.5174 | 3.9772 | 68000 | 3.3418 |
| 3.518 | 4.0357 | 69000 | 3.3414 |
| 3.5036 | 4.0942 | 70000 | 3.3365 |
| 3.5101 | 4.1527 | 71000 | 3.3367 |
| 3.5145 | 4.2112 | 72000 | 3.3361 |
| 3.5053 | 4.2697 | 73000 | 3.3355 |
| 3.5153 | 4.3282 | 74000 | 3.3334 |
| 3.5003 | 4.3867 | 75000 | 3.3334 |
| 3.5001 | 4.4452 | 76000 | 3.3326 |
| 3.5114 | 4.5036 | 77000 | 3.3298 |
| 3.5108 | 4.5621 | 78000 | 3.3292 |
| 3.4985 | 4.6206 | 79000 | 3.3288 |
| 3.497 | 4.6791 | 80000 | 3.3303 |
| 3.4982 | 4.7376 | 81000 | 3.3291 |
| 3.5068 | 4.7961 | 82000 | 3.3272 |
| 3.4915 | 4.8546 | 83000 | 3.3244 |
| 3.5036 | 4.9131 | 84000 | 3.3214 |
| 3.5027 | 4.9716 | 85000 | 3.3214 |
| 3.5078 | 5.0300 | 86000 | 3.3225 |
| 3.5112 | 5.0885 | 87000 | 3.3243 |
| 3.5049 | 5.1470 | 88000 | 3.3216 |
| 3.4917 | 5.2055 | 89000 | 3.3192 |
| 3.4802 | 5.2640 | 90000 | 3.3188 |
| 3.4971 | 5.3225 | 91000 | 3.3201 |
| 3.4941 | 5.3810 | 92000 | 3.3175 |
| 3.4998 | 5.4395 | 93000 | 3.3179 |
| 3.5011 | 5.4980 | 94000 | 3.3164 |
| 3.4912 | 5.5564 | 95000 | 3.3180 |
| 3.4961 | 5.6149 | 96000 | 3.3168 |
| 3.4833 | 5.6734 | 97000 | 3.3148 |
| 3.498 | 5.7319 | 98000 | 3.3133 |
| 3.4892 | 5.7904 | 99000 | 3.3142 |
| 3.4967 | 5.8489 | 100000 | 3.3142 |
| 3.4847 | 5.9074 | 101000 | 3.3094 |
| 3.4899 | 5.9659 | 102000 | 3.3102 |
| 3.4774 | 6.0244 | 103000 | 3.3110 |
| 3.4854 | 6.0828 | 104000 | 3.3106 |
| 3.4873 | 6.1413 | 105000 | 3.3087 |
| 3.4869 | 6.1998 | 106000 | 3.3102 |
| 3.4833 | 6.2583 | 107000 | 3.3063 |
| 3.491 | 6.3168 | 108000 | 3.3082 |
| 3.4776 | 6.3753 | 109000 | 3.3075 |
| 3.4924 | 6.4338 | 110000 | 3.3068 |
| 3.4804 | 6.4923 | 111000 | 3.3050 |
| 3.4805 | 6.5508 | 112000 | 3.3041 |
| 3.4892 | 6.6093 | 113000 | 3.3031 |
| 3.4775 | 6.6677 | 114000 | 3.3032 |
| 3.481 | 6.7262 | 115000 | 3.3036 |
| 3.4782 | 6.7847 | 116000 | 3.3025 |
| 3.4804 | 6.8432 | 117000 | 3.3017 |
| 3.4841 | 6.9017 | 118000 | 3.2999 |
| 3.4784 | 6.9602 | 119000 | 3.3008 |
| 3.4821 | 7.0187 | 120000 | 3.3001 |
| 3.4671 | 7.0772 | 121000 | 3.3008 |
| 3.485 | 7.1357 | 122000 | 3.2976 |
| 3.4737 | 7.1941 | 123000 | 3.2985 |
| 3.4793 | 7.2526 | 124000 | 3.2979 |
| 3.4651 | 7.3111 | 125000 | 3.2968 |
| 3.4847 | 7.3696 | 126000 | 3.2974 |
| 3.474 | 7.4281 | 127000 | 3.2973 |
| 3.4769 | 7.4866 | 128000 | 3.2955 |
| 3.486 | 7.5451 | 129000 | 3.2953 |
| 3.4684 | 7.6036 | 130000 | 3.2944 |
| 3.4826 | 7.6621 | 131000 | 3.2949 |
| 3.4685 | 7.7205 | 132000 | 3.2944 |
| 3.4608 | 7.7790 | 133000 | 3.2931 |
| 3.4655 | 7.8375 | 134000 | 3.2953 |
| 3.4648 | 7.8960 | 135000 | 3.2928 |
| 3.4632 | 7.9545 | 136000 | 3.2936 |
| 3.4666 | 8.0130 | 137000 | 3.2902 |
| 3.4663 | 8.0715 | 138000 | 3.2939 |
| 3.4713 | 8.1300 | 139000 | 3.2904 |
| 3.4654 | 8.1885 | 140000 | 3.2917 |
| 3.466 | 8.2469 | 141000 | 3.2913 |
| 3.4724 | 8.3054 | 142000 | 3.2889 |
| 3.4695 | 8.3639 | 143000 | 3.2890 |
| 3.4729 | 8.4224 | 144000 | 3.2876 |
| 3.4551 | 8.4809 | 145000 | 3.2898 |
| 3.4652 | 8.5394 | 146000 | 3.2885 |
| 3.4689 | 8.5979 | 147000 | 3.2854 |
| 3.4647 | 8.6564 | 148000 | 3.2857 |
| 3.4653 | 8.7149 | 149000 | 3.2857 |
| 3.4552 | 8.7733 | 150000 | 3.2861 |
| 3.47 | 8.8318 | 151000 | 3.2868 |
| 3.4627 | 8.8903 | 152000 | 3.2854 |

### Framework versions

- Transformers 4.45.2
- Pytorch 2.5.0
- Datasets 3.0.1
- Tokenizers 0.20.1
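
A quick way to confirm a local environment matches these versions (a convenience snippet, not part of the original training setup):

```python
# Print installed versions to compare against the ones listed above.
import transformers, torch, datasets, tokenizers

for pkg in (transformers, torch, datasets, tokenizers):
    print(f"{pkg.__name__}: {pkg.__version__}")
```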