
distilgpt2-finetuned-quran

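This model was trained on an unspecified dataset (the card records the dataset name as None). The card template states that training started from scratch, although the model name suggests a fine-tune of distilgpt2. It achieves the following results on the evaluation set: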

  • Loss: 2.7655

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 500
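
As a quick sanity check on the list above, the following sketch recomputes the total train batch size and traces the learning rate under a linear schedule. It assumes a single device and zero warmup steps, neither of which is stated on this card, and it takes the figure of 6 optimizer steps per epoch from the results table rather than from the hyperparameters. The helper `linear_lr` is a hypothetical name used for illustration, not part of any library.

```python
# Values copied from the hyperparameter list.
learning_rate = 2e-05
train_batch_size = 16
gradient_accumulation_steps = 4
num_epochs = 500

# Read off the results table (step count grows by 6 per epoch); not a listed hyperparameter.
steps_per_epoch = 6

# Assumptions: the card does not state the device count or any warmup.
num_devices = 1
warmup_steps = 0

# Examples contributing to each optimizer update.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
total_steps = steps_per_epoch * num_epochs


def linear_lr(step: int) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


print("effective batch size:", effective_batch_size)
print("total optimizer steps:", total_steps)
for step in (0, 1500, 3000):
    print(f"step {step}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the product of batch size and accumulation steps matches the listed total_train_batch_size, and the step total matches the last row of the results table.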

Training results

Training Loss Epoch Step Validation Loss
No log 1.0 6 2.2725
2.4443 2.0 12 2.2706
2.4443 3.0 18 2.2710
2.4085 4.0 24 2.2695
2.3977 5.0 30 2.2723
2.3977 6.0 36 2.2703
2.3733 7.0 42 2.2724
2.3733 8.0 48 2.2711
2.3597 9.0 54 2.2725
2.3509 10.0 60 2.2734
2.3509 11.0 66 2.2733
2.33 12.0 72 2.2738
2.33 13.0 78 2.2740
2.3211 14.0 84 2.2758
2.3146 15.0 90 2.2739
2.3146 16.0 96 2.2759
2.2978 17.0 102 2.2750
2.2978 18.0 108 2.2754
2.2871 19.0 114 2.2776
2.274 20.0 120 2.2754
2.274 21.0 126 2.2784
2.2679 22.0 132 2.2785
2.2679 23.0 138 2.2802
2.2532 24.0 144 2.2782
2.2424 25.0 150 2.2797
2.2424 26.0 156 2.2804
2.2329 27.0 162 2.2810
2.2329 28.0 168 2.2814
2.2232 29.0 174 2.2826
2.2108 30.0 180 2.2839
2.2108 31.0 186 2.2831
2.2034 32.0 192 2.2827
2.2034 33.0 198 2.2851
2.1868 34.0 204 2.2827
2.1831 35.0 210 2.2862
2.1831 36.0 216 2.2859
2.1745 37.0 222 2.2876
2.1745 38.0 228 2.2880
2.1683 39.0 234 2.2888
2.1487 40.0 240 2.2879
2.1487 41.0 246 2.2902
2.1452 42.0 252 2.2891
2.1452 43.0 258 2.2919
2.1372 44.0 264 2.2913
2.1267 45.0 270 2.2928
2.1267 46.0 276 2.2936
2.1161 47.0 282 2.2957
2.1161 48.0 288 2.2953
2.112 49.0 294 2.2971
2.1012 50.0 300 2.2962
2.1012 51.0 306 2.2985
2.0915 52.0 312 2.2979
2.0915 53.0 318 2.2990
2.0902 54.0 324 2.2999
2.073 55.0 330 2.2991
2.073 56.0 336 2.3033
2.0685 57.0 342 2.3024
2.0685 58.0 348 2.3042
2.0544 59.0 354 2.3057
2.0458 60.0 360 2.3060
2.0458 61.0 366 2.3058
2.0416 62.0 372 2.3097
2.0416 63.0 378 2.3075
2.0344 64.0 384 2.3108
2.0223 65.0 390 2.3121
2.0223 66.0 396 2.3114
2.0131 67.0 402 2.3126
2.0131 68.0 408 2.3159
2.0106 69.0 414 2.3142
2.0002 70.0 420 2.3168
2.0002 71.0 426 2.3182
1.9943 72.0 432 2.3195
1.9943 73.0 438 2.3172
1.9831 74.0 444 2.3203
1.9763 75.0 450 2.3226
1.9763 76.0 456 2.3211
1.9677 77.0 462 2.3245
1.9677 78.0 468 2.3266
1.9606 79.0 474 2.3263
1.9535 80.0 480 2.3299
1.9535 81.0 486 2.3297
1.9438 82.0 492 2.3284
1.9438 83.0 498 2.3311
1.9375 84.0 504 2.3332
1.934 85.0 510 2.3347
1.934 86.0 516 2.3360
1.9246 87.0 522 2.3345
1.9246 88.0 528 2.3396
1.9152 89.0 534 2.3399
1.9102 90.0 540 2.3424
1.9102 91.0 546 2.3403
1.9025 92.0 552 2.3465
1.9025 93.0 558 2.3475
1.8954 94.0 564 2.3438
1.8904 95.0 570 2.3483
1.8904 96.0 576 2.3464
1.8813 97.0 582 2.3503
1.8813 98.0 588 2.3510
1.8781 99.0 594 2.3517
1.8653 100.0 600 2.3558
1.8653 101.0 606 2.3556
1.8592 102.0 612 2.3587
1.8592 103.0 618 2.3556
1.8563 104.0 624 2.3625
1.848 105.0 630 2.3595
1.848 106.0 636 2.3683
1.8414 107.0 642 2.3619
1.8414 108.0 648 2.3638
1.8374 109.0 654 2.3685
1.8285 110.0 660 2.3684
1.8285 111.0 666 2.3694
1.8205 112.0 672 2.3717
1.8205 113.0 678 2.3705
1.8143 114.0 684 2.3776
1.8109 115.0 690 2.3736
1.8109 116.0 696 2.3811
1.8015 117.0 702 2.3783
1.8015 118.0 708 2.3805
1.7927 119.0 714 2.3817
1.7899 120.0 720 2.3869
1.7899 121.0 726 2.3848
1.7832 122.0 732 2.3880
1.7832 123.0 738 2.3856
1.7731 124.0 744 2.3936
1.7704 125.0 750 2.3891
1.7704 126.0 756 2.3940
1.7641 127.0 762 2.3963
1.7641 128.0 768 2.3971
1.7577 129.0 774 2.3985
1.7524 130.0 780 2.4004
1.7524 131.0 786 2.3993
1.7492 132.0 792 2.4023
1.7492 133.0 798 2.4035
1.7406 134.0 804 2.4079
1.7341 135.0 810 2.4074
1.7341 136.0 816 2.4074
1.7275 137.0 822 2.4113
1.7275 138.0 828 2.4140
1.7259 139.0 834 2.4124
1.7184 140.0 840 2.4194
1.7184 141.0 846 2.4138
1.7106 142.0 852 2.4225
1.7106 143.0 858 2.4200
1.7066 144.0 864 2.4239
1.6988 145.0 870 2.4233
1.6988 146.0 876 2.4250
1.6928 147.0 882 2.4276
1.6928 148.0 888 2.4277
1.6916 149.0 894 2.4301
1.686 150.0 900 2.4319
1.686 151.0 906 2.4312
1.6795 152.0 912 2.4393
1.6795 153.0 918 2.4370
1.6716 154.0 924 2.4372
1.6683 155.0 930 2.4428
1.6683 156.0 936 2.4409
1.6622 157.0 942 2.4441
1.6622 158.0 948 2.4458
1.6583 159.0 954 2.4493
1.6518 160.0 960 2.4508
1.6518 161.0 966 2.4494
1.6461 162.0 972 2.4523
1.6461 163.0 978 2.4546
1.6436 164.0 984 2.4526
1.6373 165.0 990 2.4632
1.6373 166.0 996 2.4607
1.6307 167.0 1002 2.4610
1.6307 168.0 1008 2.4641
1.6257 169.0 1014 2.4610
1.6226 170.0 1020 2.4672
1.6226 171.0 1026 2.4639
1.6159 172.0 1032 2.4693
1.6159 173.0 1038 2.4718
1.6125 174.0 1044 2.4718
1.6077 175.0 1050 2.4731
1.6077 176.0 1056 2.4774
1.6057 177.0 1062 2.4813
1.6057 178.0 1068 2.4747
1.5932 179.0 1074 2.4817
1.5925 180.0 1080 2.4799
1.5925 181.0 1086 2.4845
1.5868 182.0 1092 2.4837
1.5868 183.0 1098 2.4883
1.5834 184.0 1104 2.4895
1.5802 185.0 1110 2.4896
1.5802 186.0 1116 2.4957
1.5728 187.0 1122 2.4905
1.5728 188.0 1128 2.4967
1.5709 189.0 1134 2.4962
1.5644 190.0 1140 2.4999
1.5644 191.0 1146 2.5023
1.5615 192.0 1152 2.5034
1.5615 193.0 1158 2.5054
1.5576 194.0 1164 2.5086
1.5466 195.0 1170 2.5059
1.5466 196.0 1176 2.5098
1.5431 197.0 1182 2.5104
1.5431 198.0 1188 2.5123
1.5457 199.0 1194 2.5129
1.5397 200.0 1200 2.5174
1.5397 201.0 1206 2.5188
1.5336 202.0 1212 2.5145
1.5336 203.0 1218 2.5202
1.5285 204.0 1224 2.5192
1.5271 205.0 1230 2.5224
1.5271 206.0 1236 2.5246
1.5204 207.0 1242 2.5233
1.5204 208.0 1248 2.5261
1.5189 209.0 1254 2.5296
1.5144 210.0 1260 2.5317
1.5144 211.0 1266 2.5314
1.5097 212.0 1272 2.5321
1.5097 213.0 1278 2.5355
1.5073 214.0 1284 2.5382
1.5007 215.0 1290 2.5378
1.5007 216.0 1296 2.5429
1.4951 217.0 1302 2.5433
1.4951 218.0 1308 2.5404
1.4928 219.0 1314 2.5459
1.4927 220.0 1320 2.5477
1.4927 221.0 1326 2.5438
1.4855 222.0 1332 2.5498
1.4855 223.0 1338 2.5500
1.4877 224.0 1344 2.5513
1.474 225.0 1350 2.5509
1.474 226.0 1356 2.5555
1.4713 227.0 1362 2.5570
1.4713 228.0 1368 2.5592
1.4658 229.0 1374 2.5598
1.4687 230.0 1380 2.5647
1.4687 231.0 1386 2.5613
1.4631 232.0 1392 2.5666
1.4631 233.0 1398 2.5650
1.4614 234.0 1404 2.5695
1.4565 235.0 1410 2.5667
1.4565 236.0 1416 2.5772
1.4506 237.0 1422 2.5670
1.4506 238.0 1428 2.5756
1.4533 239.0 1434 2.5768
1.443 240.0 1440 2.5767
1.443 241.0 1446 2.5816
1.4419 242.0 1452 2.5793
1.4419 243.0 1458 2.5824
1.4405 244.0 1464 2.5778
1.4347 245.0 1470 2.5866
1.4347 246.0 1476 2.5873
1.434 247.0 1482 2.5871
1.434 248.0 1488 2.5892
1.4268 249.0 1494 2.5887
1.4275 250.0 1500 2.5887
1.4275 251.0 1506 2.5940
1.4215 252.0 1512 2.5921
1.4215 253.0 1518 2.5948
1.4195 254.0 1524 2.5941
1.4188 255.0 1530 2.6004
1.4188 256.0 1536 2.6004
1.4097 257.0 1542 2.6009
1.4097 258.0 1548 2.6020
1.4104 259.0 1554 2.6064
1.4054 260.0 1560 2.6039
1.4054 261.0 1566 2.6018
1.4037 262.0 1572 2.6099
1.4037 263.0 1578 2.6100
1.3965 264.0 1584 2.6086
1.4042 265.0 1590 2.6119
1.4042 266.0 1596 2.6155
1.3917 267.0 1602 2.6113
1.3917 268.0 1608 2.6189
1.394 269.0 1614 2.6155
1.3877 270.0 1620 2.6199
1.3877 271.0 1626 2.6170
1.3862 272.0 1632 2.6183
1.3862 273.0 1638 2.6242
1.3833 274.0 1644 2.6262
1.377 275.0 1650 2.6247
1.377 276.0 1656 2.6256
1.376 277.0 1662 2.6294
1.376 278.0 1668 2.6288
1.3725 279.0 1674 2.6307
1.3738 280.0 1680 2.6306
1.3738 281.0 1686 2.6307
1.3672 282.0 1692 2.6353
1.3672 283.0 1698 2.6345
1.3709 284.0 1704 2.6354
1.3637 285.0 1710 2.6356
1.3637 286.0 1716 2.6373
1.3592 287.0 1722 2.6352
1.3592 288.0 1728 2.6404
1.3627 289.0 1734 2.6431
1.3511 290.0 1740 2.6435
1.3511 291.0 1746 2.6456
1.3535 292.0 1752 2.6459
1.3535 293.0 1758 2.6463
1.3523 294.0 1764 2.6476
1.3461 295.0 1770 2.6470
1.3461 296.0 1776 2.6477
1.345 297.0 1782 2.6479
1.345 298.0 1788 2.6540
1.3429 299.0 1794 2.6575
1.3418 300.0 1800 2.6538
1.3418 301.0 1806 2.6554
1.3378 302.0 1812 2.6569
1.3378 303.0 1818 2.6597
1.3337 304.0 1824 2.6598
1.3316 305.0 1830 2.6592
1.3316 306.0 1836 2.6602
1.3322 307.0 1842 2.6627
1.3322 308.0 1848 2.6603
1.3217 309.0 1854 2.6683
1.3281 310.0 1860 2.6667
1.3281 311.0 1866 2.6644
1.3227 312.0 1872 2.6637
1.3227 313.0 1878 2.6723
1.3218 314.0 1884 2.6686
1.3221 315.0 1890 2.6709
1.3221 316.0 1896 2.6686
1.3153 317.0 1902 2.6725
1.3153 318.0 1908 2.6779
1.3156 319.0 1914 2.6753
1.3159 320.0 1920 2.6725
1.3159 321.0 1926 2.6762
1.314 322.0 1932 2.6766
1.314 323.0 1938 2.6797
1.3098 324.0 1944 2.6785
1.3053 325.0 1950 2.6807
1.3053 326.0 1956 2.6821
1.3051 327.0 1962 2.6798
1.3051 328.0 1968 2.6853
1.2992 329.0 1974 2.6837
1.303 330.0 1980 2.6839
1.303 331.0 1986 2.6864
1.2998 332.0 1992 2.6867
1.2998 333.0 1998 2.6914
1.2952 334.0 2004 2.6914
1.2947 335.0 2010 2.6881
1.2947 336.0 2016 2.6926
1.2899 337.0 2022 2.6905
1.2899 338.0 2028 2.6939
1.2908 339.0 2034 2.6945
1.2923 340.0 2040 2.6971
1.2923 341.0 2046 2.6949
1.2898 342.0 2052 2.6942
1.2898 343.0 2058 2.6963
1.2828 344.0 2064 2.6967
1.2823 345.0 2070 2.6995
1.2823 346.0 2076 2.7012
1.2809 347.0 2082 2.7000
1.2809 348.0 2088 2.6992
1.2779 349.0 2094 2.7015
1.2801 350.0 2100 2.7021
1.2801 351.0 2106 2.7049
1.2786 352.0 2112 2.7025
1.2786 353.0 2118 2.7037
1.2726 354.0 2124 2.7072
1.2713 355.0 2130 2.7090
1.2713 356.0 2136 2.7104
1.2692 357.0 2142 2.7069
1.2692 358.0 2148 2.7089
1.2705 359.0 2154 2.7109
1.2671 360.0 2160 2.7112
1.2671 361.0 2166 2.7122
1.2673 362.0 2172 2.7117
1.2673 363.0 2178 2.7131
1.2676 364.0 2184 2.7131
1.2624 365.0 2190 2.7175
1.2624 366.0 2196 2.7171
1.2625 367.0 2202 2.7172
1.2625 368.0 2208 2.7165
1.2596 369.0 2214 2.7153
1.2591 370.0 2220 2.7182
1.2591 371.0 2226 2.7182
1.2601 372.0 2232 2.7171
1.2601 373.0 2238 2.7230
1.2534 374.0 2244 2.7245
1.2533 375.0 2250 2.7235
1.2533 376.0 2256 2.7208
1.2515 377.0 2262 2.7230
1.2515 378.0 2268 2.7252
1.2505 379.0 2274 2.7252
1.2498 380.0 2280 2.7258
1.2498 381.0 2286 2.7276
1.2527 382.0 2292 2.7269
1.2527 383.0 2298 2.7282
1.244 384.0 2304 2.7267
1.2497 385.0 2310 2.7266
1.2497 386.0 2316 2.7284
1.2436 387.0 2322 2.7293
1.2436 388.0 2328 2.7306
1.2445 389.0 2334 2.7299
1.2424 390.0 2340 2.7304
1.2424 391.0 2346 2.7337
1.2416 392.0 2352 2.7316
1.2416 393.0 2358 2.7309
1.2408 394.0 2364 2.7349
1.2364 395.0 2370 2.7377
1.2364 396.0 2376 2.7374
1.2401 397.0 2382 2.7354
1.2401 398.0 2388 2.7363
1.2363 399.0 2394 2.7366
1.2333 400.0 2400 2.7380
1.2333 401.0 2406 2.7404
1.2328 402.0 2412 2.7387
1.2328 403.0 2418 2.7363
1.2339 404.0 2424 2.7385
1.233 405.0 2430 2.7402
1.233 406.0 2436 2.7417
1.2315 407.0 2442 2.7416
1.2315 408.0 2448 2.7406
1.231 409.0 2454 2.7422
1.2324 410.0 2460 2.7425
1.2324 411.0 2466 2.7440
1.2279 412.0 2472 2.7460
1.2279 413.0 2478 2.7459
1.2281 414.0 2484 2.7442
1.2279 415.0 2490 2.7454
1.2279 416.0 2496 2.7447
1.227 417.0 2502 2.7464
1.227 418.0 2508 2.7466
1.2293 419.0 2514 2.7468
1.2202 420.0 2520 2.7474
1.2202 421.0 2526 2.7472
1.2233 422.0 2532 2.7484
1.2233 423.0 2538 2.7507
1.2228 424.0 2544 2.7508
1.2202 425.0 2550 2.7510
1.2202 426.0 2556 2.7513
1.2183 427.0 2562 2.7509
1.2183 428.0 2568 2.7512
1.2179 429.0 2574 2.7523
1.222 430.0 2580 2.7518
1.222 431.0 2586 2.7521
1.2196 432.0 2592 2.7518
1.2196 433.0 2598 2.7528
1.2151 434.0 2604 2.7529
1.2189 435.0 2610 2.7524
1.2189 436.0 2616 2.7536
1.2159 437.0 2622 2.7537
1.2159 438.0 2628 2.7541
1.2172 439.0 2634 2.7545
1.2171 440.0 2640 2.7565
1.2171 441.0 2646 2.7567
1.2154 442.0 2652 2.7570
1.2154 443.0 2658 2.7563
1.2138 444.0 2664 2.7569
1.2117 445.0 2670 2.7575
1.2117 446.0 2676 2.7565
1.2121 447.0 2682 2.7562
1.2121 448.0 2688 2.7567
1.2139 449.0 2694 2.7582
1.21 450.0 2700 2.7599
1.21 451.0 2706 2.7606
1.2107 452.0 2712 2.7602
1.2107 453.0 2718 2.7606
1.2086 454.0 2724 2.7610
1.2065 455.0 2730 2.7612
1.2065 456.0 2736 2.7617
1.2068 457.0 2742 2.7610
1.2068 458.0 2748 2.7599
1.2091 459.0 2754 2.7601
1.2079 460.0 2760 2.7612
1.2079 461.0 2766 2.7622
1.206 462.0 2772 2.7627
1.206 463.0 2778 2.7629
1.2082 464.0 2784 2.7621
1.2074 465.0 2790 2.7621
1.2074 466.0 2796 2.7631
1.2066 467.0 2802 2.7636
1.2066 468.0 2808 2.7629
1.2062 469.0 2814 2.7620
1.205 470.0 2820 2.7620
1.205 471.0 2826 2.7625
1.2065 472.0 2832 2.7633
1.2065 473.0 2838 2.7637
1.2073 474.0 2844 2.7636
1.2036 475.0 2850 2.7640
1.2036 476.0 2856 2.7639
1.2056 477.0 2862 2.7634
1.2056 478.0 2868 2.7630
1.2061 479.0 2874 2.7636
1.207 480.0 2880 2.7640
1.207 481.0 2886 2.7643
1.2036 482.0 2892 2.7647
1.2036 483.0 2898 2.7649
1.2081 484.0 2904 2.7649
1.2031 485.0 2910 2.7646
1.2031 486.0 2916 2.7643
1.2018 487.0 2922 2.7642
1.2018 488.0 2928 2.7647
1.2027 489.0 2934 2.7650
1.2039 490.0 2940 2.7654
1.2039 491.0 2946 2.7655
1.2047 492.0 2952 2.7655
1.2047 493.0 2958 2.7655
1.2008 494.0 2964 2.7655
1.2025 495.0 2970 2.7655
1.2025 496.0 2976 2.7655
1.2028 497.0 2982 2.7656
1.2028 498.0 2988 2.7656
1.2003 499.0 2994 2.7655
1.2061 500.0 3000 2.7655
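
The table shows validation loss reaching its minimum at epoch 4 (2.2695) and then rising to 2.7655 by epoch 500 while training loss keeps falling, a pattern consistent with overfitting. The sketch below illustrates this using a handful of rows copied from the table. It assumes the reported losses are mean per-token cross-entropy in nats, so that perplexity is exp(loss); the card does not state this explicitly. The `perplexity` helper is a hypothetical name for illustration.

```python
import math

# (epoch, training_loss, validation_loss) rows copied from the results table.
rows = [
    (2, 2.4443, 2.2706),
    (4, 2.4085, 2.2695),
    (100, 1.8653, 2.3558),
    (250, 1.4275, 2.5887),
    (500, 1.2061, 2.7655),
]


def perplexity(loss: float) -> float:
    """Perplexity, assuming `loss` is mean per-token cross-entropy in nats."""
    return math.exp(loss)


# Epoch with the lowest validation loss among the sampled rows.
best_epoch, _, best_val = min(rows, key=lambda row: row[2])
final_epoch, final_train, final_val = rows[-1]

print(f"best sampled epoch: {best_epoch} (val loss {best_val}, ppl {perplexity(best_val):.2f})")
print(f"final epoch: {final_epoch} (val loss {final_val}, ppl {perplexity(final_val):.2f})")

# Gap between validation and training loss; a growing gap points to overfitting.
for epoch, train_loss, val_loss in rows:
    print(f"epoch {epoch}: gap = {val_loss - train_loss:+.4f}")
```

If this reading is right, a checkpoint from the first few epochs would generalize better than the final one, and early stopping on validation loss would have ended training far earlier than 500 epochs.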

Framework versions

  • Transformers 4.30.2
  • PyTorch 2.1.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.13.3