long-t5-local-base-finetuned-justification-v09

This model is a fine-tuned version of google/long-t5-local-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 2.3147
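
Assuming the reported loss is the mean token-level cross-entropy in nats (the transformers Trainer default for seq2seq models), this corresponds to a validation perplexity of exp(2.3147) ≈ 10.1.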

Model description

More information needed

Intended uses & limitations

More information needed
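
No usage guidance is documented. The repository name suggests the model generates justification text from long inputs, so a minimal inference sketch is given below, assuming a standard seq2seq generation workflow; the Hub namespace (`your-username`) and the 4096-token input cap are placeholders, not documented values.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder repo id: substitute the namespace that actually hosts this model.
model_id = "your-username/long-t5-local-base-finetuned-justification-v09"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# LongT5's local attention is designed for long inputs; 4096 tokens is an
# assumed cap here, not a documented property of this fine-tune.
document = "Text to generate a justification for..."
inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=4096)
output_ids = model.generate(**inputs, max_new_tokens=256, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```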

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 3e-07
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 200
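
For reference, the list above maps directly onto transformers' Seq2SeqTrainingArguments. The sketch below merely restates the reported configuration, not the author's actual training script; output_dir is arbitrary, and the per-epoch evaluation is inferred from the results table rather than stated in the card.

```python
from transformers import Seq2SeqTrainingArguments

# Restates the hyperparameters reported above; dataset loading and
# preprocessing are omitted because the training data is not documented.
training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-base-finetuned-justification-v09",
    learning_rate=3e-7,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    evaluation_strategy="epoch",  # inferred: the table logs one validation loss per epoch
)
```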

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 338 | 20.0779 |
| 26.617 | 2.0 | 676 | 17.6054 |
| 22.6857 | 3.0 | 1014 | 15.1205 |
| 22.6857 | 4.0 | 1352 | 12.4837 |
| 18.639 | 5.0 | 1690 | 9.9114 |
| 14.4577 | 6.0 | 2028 | 8.0629 |
| 14.4577 | 7.0 | 2366 | 7.5255 |
| 10.7004 | 8.0 | 2704 | 7.4006 |
| 7.8669 | 9.0 | 3042 | 7.2827 |
| 7.8669 | 10.0 | 3380 | 7.1306 |
| 6.3058 | 11.0 | 3718 | 6.9313 |
| 5.3507 | 12.0 | 4056 | 6.6880 |
| 5.3507 | 13.0 | 4394 | 6.3980 |
| 5.0661 | 14.0 | 4732 | 6.1019 |
| 4.6576 | 15.0 | 5070 | 5.7985 |
| 4.6576 | 16.0 | 5408 | 5.4902 |
| 4.374 | 17.0 | 5746 | 5.2013 |
| 4.1022 | 18.0 | 6084 | 4.9162 |
| 4.1022 | 19.0 | 6422 | 4.6802 |
| 3.9773 | 20.0 | 6760 | 4.4889 |
| 3.7391 | 21.0 | 7098 | 4.3299 |
| 3.7391 | 22.0 | 7436 | 4.2127 |
| 3.6007 | 23.0 | 7774 | 4.1193 |
| 3.472 | 24.0 | 8112 | 4.0468 |
| 3.472 | 25.0 | 8450 | 3.9895 |
| 3.3327 | 26.0 | 8788 | 3.9357 |
| 3.3196 | 27.0 | 9126 | 3.8895 |
| 3.3196 | 28.0 | 9464 | 3.8449 |
| 3.229 | 29.0 | 9802 | 3.8026 |
| 3.1795 | 30.0 | 10140 | 3.7613 |
| 3.1795 | 31.0 | 10478 | 3.7200 |
| 3.0775 | 32.0 | 10816 | 3.6811 |
| 3.065 | 33.0 | 11154 | 3.6424 |
| 3.065 | 34.0 | 11492 | 3.6048 |
| 3.0145 | 35.0 | 11830 | 3.5750 |
| 2.9987 | 36.0 | 12168 | 3.5381 |
| 2.9096 | 37.0 | 12506 | 3.5031 |
| 2.9096 | 38.0 | 12844 | 3.4699 |
| 2.8816 | 39.0 | 13182 | 3.4402 |
| 2.8767 | 40.0 | 13520 | 3.4116 |
| 2.8767 | 41.0 | 13858 | 3.3847 |
| 2.8189 | 42.0 | 14196 | 3.3540 |
| 2.8297 | 43.0 | 14534 | 3.3275 |
| 2.8297 | 44.0 | 14872 | 3.3008 |
| 2.7376 | 45.0 | 15210 | 3.2745 |
| 2.7519 | 46.0 | 15548 | 3.2521 |
| 2.7519 | 47.0 | 15886 | 3.2273 |
| 2.7207 | 48.0 | 16224 | 3.2038 |
| 2.7056 | 49.0 | 16562 | 3.1822 |
| 2.7056 | 50.0 | 16900 | 3.1619 |
| 2.6539 | 51.0 | 17238 | 3.1426 |
| 2.6393 | 52.0 | 17576 | 3.1219 |
| 2.6393 | 53.0 | 17914 | 3.1015 |
| 2.6396 | 54.0 | 18252 | 3.0818 |
| 2.6029 | 55.0 | 18590 | 3.0604 |
| 2.6029 | 56.0 | 18928 | 3.0448 |
| 2.5527 | 57.0 | 19266 | 3.0251 |
| 2.5793 | 58.0 | 19604 | 3.0069 |
| 2.5793 | 59.0 | 19942 | 2.9911 |
| 2.5443 | 60.0 | 20280 | 2.9724 |
| 2.5083 | 61.0 | 20618 | 2.9560 |
| 2.5083 | 62.0 | 20956 | 2.9387 |
| 2.5368 | 63.0 | 21294 | 2.9205 |
| 2.4771 | 64.0 | 21632 | 2.9040 |
| 2.4771 | 65.0 | 21970 | 2.8895 |
| 2.4875 | 66.0 | 22308 | 2.8701 |
| 2.4532 | 67.0 | 22646 | 2.8570 |
| 2.4532 | 68.0 | 22984 | 2.8397 |
| 2.4276 | 69.0 | 23322 | 2.8243 |
| 2.4279 | 70.0 | 23660 | 2.8110 |
| 2.4279 | 71.0 | 23998 | 2.7950 |
| 2.3944 | 72.0 | 24336 | 2.7816 |
| 2.3907 | 73.0 | 24674 | 2.7704 |
| 2.4014 | 74.0 | 25012 | 2.7564 |
| 2.4014 | 75.0 | 25350 | 2.7423 |
| 2.3698 | 76.0 | 25688 | 2.7295 |
| 2.3408 | 77.0 | 26026 | 2.7172 |
| 2.3408 | 78.0 | 26364 | 2.7046 |
| 2.3404 | 79.0 | 26702 | 2.6916 |
| 2.316 | 80.0 | 27040 | 2.6827 |
| 2.316 | 81.0 | 27378 | 2.6706 |
| 2.3322 | 82.0 | 27716 | 2.6607 |
| 2.3005 | 83.0 | 28054 | 2.6500 |
| 2.3005 | 84.0 | 28392 | 2.6408 |
| 2.2661 | 85.0 | 28730 | 2.6315 |
| 2.2946 | 86.0 | 29068 | 2.6231 |
| 2.2946 | 87.0 | 29406 | 2.6131 |
| 2.2493 | 88.0 | 29744 | 2.6034 |
| 2.2623 | 89.0 | 30082 | 2.5940 |
| 2.2623 | 90.0 | 30420 | 2.5857 |
| 2.2464 | 91.0 | 30758 | 2.5777 |
| 2.2203 | 92.0 | 31096 | 2.5714 |
| 2.2203 | 93.0 | 31434 | 2.5641 |
| 2.233 | 94.0 | 31772 | 2.5562 |
| 2.2101 | 95.0 | 32110 | 2.5493 |
| 2.2101 | 96.0 | 32448 | 2.5435 |
| 2.2321 | 97.0 | 32786 | 2.5376 |
| 2.1743 | 98.0 | 33124 | 2.5304 |
| 2.1743 | 99.0 | 33462 | 2.5253 |
| 2.2033 | 100.0 | 33800 | 2.5202 |
| 2.1874 | 101.0 | 34138 | 2.5154 |
| 2.1874 | 102.0 | 34476 | 2.5092 |
| 2.1615 | 103.0 | 34814 | 2.5054 |
| 2.1565 | 104.0 | 35152 | 2.5001 |
| 2.1565 | 105.0 | 35490 | 2.4950 |
| 2.152 | 106.0 | 35828 | 2.4897 |
| 2.1398 | 107.0 | 36166 | 2.4851 |
| 2.1424 | 108.0 | 36504 | 2.4812 |
| 2.1424 | 109.0 | 36842 | 2.4767 |
| 2.1272 | 110.0 | 37180 | 2.4734 |
| 2.1171 | 111.0 | 37518 | 2.4686 |
| 2.1171 | 112.0 | 37856 | 2.4649 |
| 2.1325 | 113.0 | 38194 | 2.4597 |
| 2.0975 | 114.0 | 38532 | 2.4567 |
| 2.0975 | 115.0 | 38870 | 2.4523 |
| 2.1156 | 116.0 | 39208 | 2.4487 |
| 2.0628 | 117.0 | 39546 | 2.4452 |
| 2.0628 | 118.0 | 39884 | 2.4417 |
| 2.1061 | 119.0 | 40222 | 2.4385 |
| 2.0897 | 120.0 | 40560 | 2.4343 |
| 2.0897 | 121.0 | 40898 | 2.4316 |
| 2.083 | 122.0 | 41236 | 2.4271 |
| 2.0693 | 123.0 | 41574 | 2.4241 |
| 2.0693 | 124.0 | 41912 | 2.4212 |
| 2.0748 | 125.0 | 42250 | 2.4180 |
| 2.0497 | 126.0 | 42588 | 2.4152 |
| 2.0497 | 127.0 | 42926 | 2.4128 |
| 2.0803 | 128.0 | 43264 | 2.4098 |
| 2.0701 | 129.0 | 43602 | 2.4060 |
| 2.0701 | 130.0 | 43940 | 2.4032 |
| 2.0358 | 131.0 | 44278 | 2.4010 |
| 2.0487 | 132.0 | 44616 | 2.3981 |
| 2.0487 | 133.0 | 44954 | 2.3956 |
| 2.0402 | 134.0 | 45292 | 2.3927 |
| 2.0425 | 135.0 | 45630 | 2.3895 |
| 2.0425 | 136.0 | 45968 | 2.3873 |
| 2.0379 | 137.0 | 46306 | 2.3844 |
| 2.0297 | 138.0 | 46644 | 2.3818 |
| 2.0297 | 139.0 | 46982 | 2.3785 |
| 2.046 | 140.0 | 47320 | 2.3766 |
| 2.0066 | 141.0 | 47658 | 2.3739 |
| 2.0066 | 142.0 | 47996 | 2.3712 |
| 2.0186 | 143.0 | 48334 | 2.3696 |
| 2.0474 | 144.0 | 48672 | 2.3669 |
| 1.9858 | 145.0 | 49010 | 2.3652 |
| 1.9858 | 146.0 | 49348 | 2.3631 |
| 2.0216 | 147.0 | 49686 | 2.3609 |
| 1.9961 | 148.0 | 50024 | 2.3588 |
| 1.9961 | 149.0 | 50362 | 2.3573 |
| 1.9873 | 150.0 | 50700 | 2.3554 |
| 2.0043 | 151.0 | 51038 | 2.3530 |
| 2.0043 | 152.0 | 51376 | 2.3508 |
| 2.0045 | 153.0 | 51714 | 2.3490 |
| 1.9951 | 154.0 | 52052 | 2.3475 |
| 1.9951 | 155.0 | 52390 | 2.3458 |
| 2.02 | 156.0 | 52728 | 2.3448 |
| 1.9924 | 157.0 | 53066 | 2.3429 |
| 1.9924 | 158.0 | 53404 | 2.3410 |
| 1.9757 | 159.0 | 53742 | 2.3398 |
| 1.9882 | 160.0 | 54080 | 2.3383 |
| 1.9882 | 161.0 | 54418 | 2.3368 |
| 2.0006 | 162.0 | 54756 | 2.3355 |
| 1.9984 | 163.0 | 55094 | 2.3341 |
| 1.9984 | 164.0 | 55432 | 2.3331 |
| 1.9823 | 165.0 | 55770 | 2.3318 |
| 1.9548 | 166.0 | 56108 | 2.3309 |
| 1.9548 | 167.0 | 56446 | 2.3297 |
| 1.9812 | 168.0 | 56784 | 2.3288 |
| 1.9793 | 169.0 | 57122 | 2.3276 |
| 1.9793 | 170.0 | 57460 | 2.3264 |
| 2.0022 | 171.0 | 57798 | 2.3255 |
| 1.9593 | 172.0 | 58136 | 2.3248 |
| 1.9593 | 173.0 | 58474 | 2.3236 |
| 1.9756 | 174.0 | 58812 | 2.3228 |
| 1.9835 | 175.0 | 59150 | 2.3221 |
| 1.9835 | 176.0 | 59488 | 2.3214 |
| 1.9655 | 177.0 | 59826 | 2.3208 |
| 1.9712 | 178.0 | 60164 | 2.3202 |
| 1.9658 | 179.0 | 60502 | 2.3195 |
| 1.9658 | 180.0 | 60840 | 2.3188 |
| 1.9501 | 181.0 | 61178 | 2.3185 |
| 1.992 | 182.0 | 61516 | 2.3180 |
| 1.992 | 183.0 | 61854 | 2.3176 |
| 1.9784 | 184.0 | 62192 | 2.3172 |
| 1.968 | 185.0 | 62530 | 2.3169 |
| 1.968 | 186.0 | 62868 | 2.3165 |
| 1.9746 | 187.0 | 63206 | 2.3161 |
| 1.9615 | 188.0 | 63544 | 2.3159 |
| 1.9615 | 189.0 | 63882 | 2.3157 |
| 1.9405 | 190.0 | 64220 | 2.3155 |
| 1.9869 | 191.0 | 64558 | 2.3153 |
| 1.9869 | 192.0 | 64896 | 2.3152 |
| 1.9614 | 193.0 | 65234 | 2.3150 |
| 1.9641 | 194.0 | 65572 | 2.3149 |
| 1.9641 | 195.0 | 65910 | 2.3148 |
| 1.9813 | 196.0 | 66248 | 2.3148 |
| 1.9676 | 197.0 | 66586 | 2.3147 |
| 1.9676 | 198.0 | 66924 | 2.3147 |
| 1.9302 | 199.0 | 67262 | 2.3147 |
| 1.99 | 200.0 | 67600 | 2.3147 |
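
The "No log" entry for epoch 1 occurs because the Trainer logs training loss every 500 steps by default, and the first evaluation at step 338 preceded any logging event; for the same reason, consecutive epochs often share a training-loss value. The validation loss plateaus at 2.3147 over the final epochs, indicating the run had effectively converged.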

Framework versions

  • Transformers 4.39.3
  • Pytorch 2.2.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2