
gpt2-finetuned-justification-v5

This model appears to be a fine-tuned GPT-2 variant (the base checkpoint is not recorded in the card metadata) trained on an unspecified dataset. It achieves the following results on the evaluation set; a minimal usage sketch follows the list:

  • Loss: 0.3678
  • Rouge1: 28.2558
  • Rouge2: 13.2942
  • RougeL: 20.5646
  • RougeLsum: 25.4960

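Since the intended usage is not documented, the following is only a minimal sketch of loading the checkpoint as a Transformers causal LM for generation. The repo id and the prompt format are assumptions, not taken from this card:

```python
# Minimal usage sketch. The repo id and prompt layout are assumptions; the card does
# not document the hub namespace or the expected input format.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2-finetuned-justification-v5"  # assumed id; prepend the owning namespace if needed
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Claim: ...\nJustification:"  # illustrative prompt only
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    num_beams=4,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
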
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of how they map onto `TrainingArguments` follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 200

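A hedged sketch of how these hyperparameters could be passed to the 🤗 Trainer is shown below. The training script itself is not included in this card, so model construction, dataset loading, and the metrics function are left as placeholders:

```python
# Sketch only: mirrors the hyperparameters listed above. Dataset, model, and
# compute_metrics are not documented here, so they remain commented placeholders.
from transformers import Trainer, TrainingArguments, set_seed

set_seed(42)  # seed: 42

args = TrainingArguments(
    output_dir="gpt2-finetuned-justification-v5",
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    num_train_epochs=200,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # consistent with the per-epoch rows in the results table
    logging_strategy="epoch",
)

# trainer = Trainer(model=model, args=args, train_dataset=..., eval_dataset=...,
#                   tokenizer=tokenizer, compute_metrics=compute_metrics)
# trainer.train()
```
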
Training results

Training Loss Epoch Step Validation Loss Rouge1 Rouge2 RougeL RougeLsum
No log 1.0 338 0.2009 31.3319 13.9488 23.6685 29.4548
0.2408 2.0 676 0.1974 29.1645 13.8230 21.7497 26.8219
0.1816 3.0 1014 0.1970 29.1922 13.8428 21.7595 26.8727
0.1816 4.0 1352 0.1976 31.8894 13.6293 23.2637 29.1353
0.1594 5.0 1690 0.1999 28.9161 13.7629 21.7786 26.6707
0.1417 6.0 2028 0.2019 26.1827 13.0883 21.6704 25.0609
0.1417 7.0 2366 0.2052 32.5679 14.4370 23.6445 29.7651
0.1287 8.0 2704 0.2093 30.4390 13.4570 22.2407 27.9234
0.1148 9.0 3042 0.2140 26.3746 13.2671 21.7004 25.0909
0.1148 10.0 3380 0.2182 30.4390 13.4570 22.2407 27.9234
0.0996 11.0 3718 0.2235 28.9954 13.7629 21.8212 26.7486
0.0918 12.0 4056 0.2292 29.6633 13.5103 21.9731 27.1570
0.0918 13.0 4394 0.2344 32.1781 13.7632 22.8996 29.0738
0.0783 14.0 4732 0.2369 27.9019 13.5715 21.1971 25.6530
0.0697 15.0 5070 0.2448 30.3482 13.6242 22.1001 27.5802
0.0697 16.0 5408 0.2478 32.2060 13.7632 22.8806 29.0774
0.0617 17.0 5746 0.2511 30.3482 13.6242 22.1001 27.5802
0.0547 18.0 6084 0.2562 30.3482 13.6242 22.1001 27.5802
0.0547 19.0 6422 0.2614 31.7077 14.1430 23.3427 29.2428
0.0486 20.0 6760 0.2619 30.3691 13.6839 22.1309 27.5877
0.0431 21.0 7098 0.2666 30.3482 13.6242 22.1001 27.5802
0.0431 22.0 7436 0.2661 32.1647 13.7632 22.8606 29.0573
0.0398 23.0 7774 0.2710 28.3033 13.7723 21.5098 26.0225
0.0356 24.0 8112 0.2743 30.3482 13.6242 22.1001 27.5802
0.0356 25.0 8450 0.2729 30.3482 13.6242 22.1001 27.5802
0.033 26.0 8788 0.2761 30.3482 13.6242 22.1001 27.5802
0.03 27.0 9126 0.2782 30.3482 13.6242 22.1001 27.5802
0.03 28.0 9464 0.2821 27.7803 13.2224 20.9152 25.2690
0.0274 29.0 9802 0.2840 27.7800 13.4113 20.6604 25.5476
0.0257 30.0 10140 0.2855 30.2852 13.6008 22.0867 27.5376
0.0257 31.0 10478 0.2878 30.3482 13.6242 22.1001 27.5802
0.0238 32.0 10816 0.2888 30.3482 13.6242 22.1001 27.5802
0.0219 33.0 11154 0.2872 30.3482 13.6242 22.1001 27.5802
0.0219 34.0 11492 0.2905 30.3482 13.6242 22.1001 27.5802
0.0211 35.0 11830 0.2926 30.2097 13.6132 22.0455 27.4601
0.0192 36.0 12168 0.2961 26.7582 13.4935 20.6488 24.7462
0.0186 37.0 12506 0.2984 26.7089 12.7390 20.4116 24.8107
0.0186 38.0 12844 0.2955 30.3482 13.6242 22.1001 27.5802
0.0175 39.0 13182 0.2985 30.3545 13.8737 22.2152 27.6296
0.017 40.0 13520 0.3025 30.3545 13.8737 22.2152 27.6296
0.017 41.0 13858 0.3038 30.5505 13.0272 21.9910 27.6810
0.0158 42.0 14196 0.3042 30.3545 13.8737 22.2152 27.6296
0.0156 43.0 14534 0.3048 26.7070 13.5119 20.6281 24.6882
0.0156 44.0 14872 0.3061 30.3440 13.8737 22.1997 27.6126
0.0147 45.0 15210 0.3081 26.7070 13.5119 20.6281 24.6882
0.0141 46.0 15548 0.3133 26.7414 13.5119 20.6325 24.7363
0.0141 47.0 15886 0.3115 30.3482 13.6242 22.1001 27.5802
0.0135 48.0 16224 0.3131 26.5987 13.0206 20.4843 24.9550
0.0131 49.0 16562 0.3142 27.9816 13.6880 21.2917 25.7438
0.0131 50.0 16900 0.3161 26.9511 13.3418 20.9442 25.0861
0.0128 51.0 17238 0.3157 26.4405 12.8498 19.8095 23.8804
0.0123 52.0 17576 0.3169 26.4482 12.8554 19.8262 23.8895
0.0123 53.0 17914 0.3162 27.4677 13.5119 21.0011 25.2709
0.0121 54.0 18252 0.3192 26.4405 12.8498 19.8095 23.8804
0.012 55.0 18590 0.3192 27.4743 13.5029 21.0202 25.2768
0.012 56.0 18928 0.3217 28.7538 13.6889 21.2448 26.2972
0.0116 57.0 19266 0.3221 30.3482 13.6242 22.1001 27.5802
0.0112 58.0 19604 0.3214 27.4677 13.5119 21.0011 25.2709
0.0112 59.0 19942 0.3256 30.3482 13.6242 22.1001 27.5802
0.011 60.0 20280 0.3246 30.3482 13.6242 22.1001 27.5802
0.0107 61.0 20618 0.3269 26.4008 12.8554 19.8262 23.8411
0.0107 62.0 20956 0.3262 30.3482 13.6242 22.1001 27.5802
0.0107 63.0 21294 0.3262 26.4405 12.8498 19.8095 23.8804
0.0104 64.0 21632 0.3313 26.4008 12.8554 19.8262 23.8411
0.0104 65.0 21970 0.3301 26.4405 12.8498 19.8095 23.8804
0.0102 66.0 22308 0.3334 27.0875 13.1212 20.9683 24.8767
0.01 67.0 22646 0.3307 27.1356 13.1167 20.9787 24.9210
0.01 68.0 22984 0.3351 26.4482 12.8554 19.8262 23.8895
0.0101 69.0 23322 0.3334 28.2990 13.2942 20.5684 25.5223
0.0098 70.0 23660 0.3337 27.4743 13.5029 21.0202 25.2768
0.0098 71.0 23998 0.3320 26.5357 12.9745 20.4634 24.8632
0.0097 72.0 24336 0.3371 26.4405 12.8498 19.8095 23.8804
0.0094 73.0 24674 0.3365 28.2536 13.2942 20.5684 25.4770
0.0096 74.0 25012 0.3334 27.4677 13.5119 21.0011 25.2709
0.0096 75.0 25350 0.3401 26.4008 12.8554 19.8262 23.8411
0.0094 76.0 25688 0.3369 27.0875 13.1212 20.9683 24.8767
0.0092 77.0 26026 0.3379 28.2558 13.2942 20.5646 25.4960
0.0092 78.0 26364 0.3402 26.4405 12.8498 19.8095 23.8804
0.0091 79.0 26702 0.3394 29.8901 13.3237 21.5272 26.8283
0.0091 80.0 27040 0.3381 20.8422 11.5482 18.3383 20.0148
0.0091 81.0 27378 0.3375 28.2558 13.2942 20.5646 25.4960
0.009 82.0 27716 0.3382 28.2536 13.2942 20.5684 25.4770
0.0088 83.0 28054 0.3393 27.4677 13.5119 21.0011 25.2709
0.0088 84.0 28392 0.3412 27.2358 13.8221 21.2686 25.1893
0.0087 85.0 28730 0.3473 26.4405 12.8498 19.8095 23.8804
0.0088 86.0 29068 0.3433 28.2902 13.2884 20.5599 25.5133
0.0088 87.0 29406 0.3433 30.3482 13.6242 22.1001 27.5802
0.0086 88.0 29744 0.3430 27.0875 13.1212 20.9683 24.8767
0.0086 89.0 30082 0.3465 30.3482 13.6242 22.1001 27.5802
0.0086 90.0 30420 0.3444 26.4611 12.8498 19.8370 23.8945
0.0085 91.0 30758 0.3480 26.4243 12.8644 19.8396 23.8598
0.0085 92.0 31096 0.3462 26.4405 12.8498 19.8095 23.8804
0.0085 93.0 31434 0.3458 30.3482 13.6242 22.1001 27.5802
0.0084 94.0 31772 0.3433 30.3482 13.6242 22.1001 27.5802
0.0084 95.0 32110 0.3468 26.4405 12.8498 19.8095 23.8804
0.0084 96.0 32448 0.3453 19.7830 7.7026 16.7373 18.8332
0.0083 97.0 32786 0.3499 21.9400 11.8071 19.0004 20.7038
0.0082 98.0 33124 0.3509 28.9561 13.0484 21.4552 26.5578
0.0082 99.0 33462 0.3493 28.2536 13.2942 20.5684 25.4770
0.0082 100.0 33800 0.3505 27.1066 13.1324 20.9784 24.8951
0.0082 101.0 34138 0.3482 27.0875 13.1212 20.9683 24.8767
0.0082 102.0 34476 0.3497 27.0875 13.1212 20.9683 24.8767
0.0082 103.0 34814 0.3517 27.0875 13.1212 20.9683 24.8767
0.0081 104.0 35152 0.3529 30.3100 13.6441 22.1010 27.5391
0.0081 105.0 35490 0.3490 26.4405 12.8498 19.8095 23.8804
0.0081 106.0 35828 0.3524 30.3282 13.6242 22.1001 27.5644
0.0079 107.0 36166 0.3514 28.2536 13.2942 20.5684 25.4770
0.0081 108.0 36504 0.3534 30.3482 13.6242 22.1001 27.5802
0.0081 109.0 36842 0.3518 28.3079 13.2884 20.5813 25.5310
0.0078 110.0 37180 0.3538 30.3482 13.6242 22.1001 27.5802
0.0079 111.0 37518 0.3567 27.2358 13.8221 21.2686 25.1893
0.0079 112.0 37856 0.3517 30.3482 13.6242 22.1001 27.5802
0.0078 113.0 38194 0.3542 27.2358 13.8221 21.2686 25.1893
0.0078 114.0 38532 0.3558 30.3482 13.6242 22.1001 27.5802
0.0078 115.0 38870 0.3571 28.3079 13.2884 20.5813 25.5310
0.0077 116.0 39208 0.3566 26.4405 12.8498 19.8095 23.8804
0.0077 117.0 39546 0.3590 27.0875 13.1212 20.9683 24.8767
0.0077 118.0 39884 0.3574 23.4456 12.1733 19.9307 22.2553
0.0076 119.0 40222 0.3563 26.4405 12.8498 19.8095 23.8804
0.0077 120.0 40560 0.3547 26.4405 12.8498 19.8095 23.8804
0.0077 121.0 40898 0.3590 26.4611 12.8498 19.8370 23.8945
0.0076 122.0 41236 0.3559 22.1818 12.0059 19.2020 20.9696
0.0076 123.0 41574 0.3529 30.3482 13.6242 22.1001 27.5802
0.0076 124.0 41912 0.3566 30.3482 13.6242 22.1001 27.5802
0.0076 125.0 42250 0.3586 26.4243 12.8644 19.8396 23.8598
0.0076 126.0 42588 0.3562 26.4405 12.8498 19.8095 23.8804
0.0076 127.0 42926 0.3594 28.2558 13.2942 20.5646 25.4960
0.0075 128.0 43264 0.3575 30.3482 13.6242 22.1001 27.5802
0.0075 129.0 43602 0.3536 30.3482 13.6242 22.1001 27.5802
0.0075 130.0 43940 0.3566 28.2536 13.2942 20.5684 25.4770
0.0074 131.0 44278 0.3591 30.3482 13.6242 22.1001 27.5802
0.0075 132.0 44616 0.3576 30.3482 13.6242 22.1001 27.5802
0.0075 133.0 44954 0.3573 26.4611 12.8498 19.8370 23.8945
0.0075 134.0 45292 0.3580 26.4008 12.8554 19.8262 23.8411
0.0075 135.0 45630 0.3584 26.4008 12.8554 19.8262 23.8411
0.0075 136.0 45968 0.3584 27.0875 13.1212 20.9683 24.8767
0.0074 137.0 46306 0.3591 30.3482 13.6242 22.1001 27.5802
0.0074 138.0 46644 0.3604 28.3079 13.2884 20.5813 25.5310
0.0074 139.0 46982 0.3624 25.6025 13.3836 19.9214 23.2847
0.0074 140.0 47320 0.3598 28.2536 13.2942 20.5684 25.4770
0.0073 141.0 47658 0.3604 26.4405 12.8498 19.8095 23.8804
0.0073 142.0 47996 0.3613 26.3898 12.8644 19.8160 23.8305
0.0074 143.0 48334 0.3614 28.2536 13.2942 20.5684 25.4770
0.0074 144.0 48672 0.3615 28.2558 13.2942 20.5646 25.4960
0.0073 145.0 49010 0.3608 28.2536 13.2942 20.5684 25.4770
0.0073 146.0 49348 0.3616 26.4405 12.8498 19.8095 23.8804
0.0072 147.0 49686 0.3652 28.9561 13.0484 21.4552 26.5578
0.0073 148.0 50024 0.3632 28.2536 13.2942 20.5684 25.4770
0.0073 149.0 50362 0.3603 27.0875 13.1212 20.9683 24.8767
0.0073 150.0 50700 0.3608 26.3919 12.8457 19.8638 23.8486
0.0072 151.0 51038 0.3614 26.3919 12.8457 19.8638 23.8486
0.0072 152.0 51376 0.3624 28.2536 13.2942 20.5684 25.4770
0.0071 153.0 51714 0.3615 26.4008 12.8554 19.8262 23.8411
0.0071 154.0 52052 0.3636 26.4405 12.8498 19.8095 23.8804
0.0071 155.0 52390 0.3646 30.3482 13.6242 22.1001 27.5802
0.0072 156.0 52728 0.3656 26.4405 12.8498 19.8095 23.8804
0.0071 157.0 53066 0.3653 26.4405 12.8498 19.8095 23.8804
0.0071 158.0 53404 0.3644 26.4405 12.8498 19.8095 23.8804
0.0071 159.0 53742 0.3648 26.4611 12.8498 19.8370 23.8945
0.0071 160.0 54080 0.3616 28.2107 13.2942 20.5662 25.4111
0.0071 161.0 54418 0.3629 28.2536 13.2942 20.5684 25.4770
0.0071 162.0 54756 0.3647 28.2558 13.2942 20.5646 25.4960
0.007 163.0 55094 0.3636 28.2558 13.2942 20.5646 25.4960
0.007 164.0 55432 0.3650 26.4405 12.8498 19.8095 23.8804
0.007 165.0 55770 0.3663 28.2558 13.2942 20.5646 25.4960
0.007 166.0 56108 0.3659 28.2558 13.2942 20.5646 25.4960
0.007 167.0 56446 0.3676 28.2558 13.2942 20.5646 25.4960
0.0069 168.0 56784 0.3659 28.2558 13.2942 20.5646 25.4960
0.0069 169.0 57122 0.3674 28.2536 13.2942 20.5684 25.4770
0.0069 170.0 57460 0.3662 26.4405 12.8498 19.8095 23.8804
0.007 171.0 57798 0.3651 26.4405 12.8498 19.8095 23.8804
0.007 172.0 58136 0.3670 26.4405 12.8498 19.8095 23.8804
0.007 173.0 58474 0.3666 26.4405 12.8498 19.8095 23.8804
0.007 174.0 58812 0.3684 26.5255 12.8696 19.9233 24.0077
0.0069 175.0 59150 0.3682 28.2902 13.2884 20.5599 25.5133
0.0069 176.0 59488 0.3681 28.3079 13.2884 20.5813 25.5310
0.0069 177.0 59826 0.3687 28.2536 13.2942 20.5684 25.4770
0.0069 178.0 60164 0.3691 28.2558 13.2942 20.5646 25.4960
0.0069 179.0 60502 0.3656 26.4008 12.8554 19.8262 23.8411
0.0069 180.0 60840 0.3664 28.2536 13.2942 20.5684 25.4770
0.0069 181.0 61178 0.3663 28.2558 13.2942 20.5646 25.4960
0.0069 182.0 61516 0.3667 28.2558 13.2942 20.5646 25.4960
0.0069 183.0 61854 0.3658 28.2558 13.2942 20.5646 25.4960
0.0069 184.0 62192 0.3671 28.2558 13.2942 20.5646 25.4960
0.0068 185.0 62530 0.3686 28.2558 13.2942 20.5646 25.4960
0.0068 186.0 62868 0.3670 28.2652 13.2942 20.5234 25.4470
0.0068 187.0 63206 0.3667 28.2558 13.2942 20.5646 25.4960
0.0068 188.0 63544 0.3669 28.2558 13.2942 20.5646 25.4960
0.0068 189.0 63882 0.3676 28.2558 13.2942 20.5646 25.4960
0.0067 190.0 64220 0.3675 28.2558 13.2942 20.5646 25.4960
0.0068 191.0 64558 0.3680 28.2558 13.2942 20.5646 25.4960
0.0068 192.0 64896 0.3681 28.2558 13.2942 20.5646 25.4960
0.007 193.0 65234 0.3675 28.2558 13.2942 20.5646 25.4960
0.0068 194.0 65572 0.3675 28.2558 13.2942 20.5646 25.4960
0.0068 195.0 65910 0.3674 28.2558 13.2942 20.5646 25.4960
0.0068 196.0 66248 0.3679 28.2558 13.2942 20.5646 25.4960
0.0068 197.0 66586 0.3678 28.2558 13.2942 20.5646 25.4960
0.0068 198.0 66924 0.3677 28.2558 13.2942 20.5646 25.4960
0.0067 199.0 67262 0.3678 28.2558 13.2942 20.5646 25.4960
0.0068 200.0 67600 0.3678 28.2558 13.2942 20.5646 25.4960
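
The ROUGE columns above come from per-epoch evaluation. A minimal sketch of how such scores can be reproduced with the `evaluate` library is shown below; the evaluation set is not documented in this card, so the predictions and references are placeholders only:

```python
# Sketch: computing ROUGE scores like those in the table with the `evaluate` library.
# The evaluation data is not documented, so these strings are illustrative placeholders.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["generated justification text"]  # placeholder model outputs
references = ["reference justification text"]   # placeholder gold texts
scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
print({k: round(v * 100, 4) for k, v in scores.items()})  # rouge1, rouge2, rougeL, rougeLsum
```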

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.2.2+cu121
  • Datasets 2.16.0
  • Tokenizers 0.15.2