
long-t5-local-base-finetuned-justification-v06

This model is a fine-tuned version of google/long-t5-local-base on an unspecified dataset. It achieves the following results on the evaluation set (a usage sketch follows below):

  • Loss: 1.8645
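
A minimal inference sketch with the transformers library is shown below. The repo id is an assumption (the card does not state the published Hub path), and the model loads through the generic seq2seq auto classes:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repo id; substitute the actual Hub path or a local checkpoint dir.
model_id = "long-t5-local-base-finetuned-justification-v06"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# LongT5's local attention is designed for long inputs, so long documents are fine.
text = "Replace with the long input text that a justification is generated from."
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```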

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (mirrored in the sketch after this list):

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 200
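
A sketch of how these settings map onto Seq2SeqTrainingArguments in transformers 4.38.x. The card does not say whether Trainer or Seq2SeqTrainer was used, so the seq2seq variant here is an assumption, as are output_dir and the evaluation schedule; the remaining values mirror the list above:

```python
from transformers import Seq2SeqTrainingArguments

# output_dir and evaluation_strategy are assumptions; the remaining values
# mirror the hyperparameter list above (argument names as of transformers 4.38.x).
training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-base-finetuned-justification-v06",
    learning_rate=2e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    evaluation_strategy="epoch",  # the results table logs one evaluation per epoch
)
```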

Training results

Training Loss | Epoch | Step | Validation Loss
5.625 1.0 676 2.2487
1.8303 2.0 1352 1.9383
1.4604 3.0 2028 1.7984
1.4095 4.0 2704 1.6848
1.344 5.0 3380 1.6151
1.2566 6.0 4056 1.5658
1.2083 7.0 4732 1.5206
1.1799 8.0 5408 1.4763
1.1419 9.0 6084 1.4418
1.0928 10.0 6760 1.4131
1.134 11.0 7436 1.3941
1.0297 12.0 8112 1.3594
1.0153 13.0 8788 1.3456
1.0246 14.0 9464 1.3439
0.9392 15.0 10140 1.3199
0.9589 16.0 10816 1.3139
0.9286 17.0 11492 1.3046
0.8812 18.0 12168 1.2886
0.9437 19.0 12844 1.2862
0.8756 20.0 13520 1.2817
0.8738 21.0 14196 1.2762
0.8467 22.0 14872 1.2668
0.8306 23.0 15548 1.2623
0.8471 24.0 16224 1.2637
0.8003 25.0 16900 1.2530
0.8201 26.0 17576 1.2478
0.7935 27.0 18252 1.2573
0.7493 28.0 18928 1.2488
0.772 29.0 19604 1.2480
0.7537 30.0 20280 1.2558
0.7466 31.0 20956 1.2511
0.7481 32.0 21632 1.2561
0.7016 33.0 22308 1.2619
0.7067 34.0 22984 1.2557
0.7206 35.0 23660 1.2493
0.6842 36.0 24336 1.2528
0.6835 37.0 25012 1.2626
0.6799 38.0 25688 1.2605
0.6293 39.0 26364 1.2746
0.6269 40.0 27040 1.2725
0.6341 41.0 27716 1.2671
0.6193 42.0 28392 1.2739
0.6434 43.0 29068 1.2784
0.6 44.0 29744 1.2846
0.5844 45.0 30420 1.3010
0.5801 46.0 31096 1.2964
0.5803 47.0 31772 1.2938
0.5755 48.0 32448 1.2986
0.5703 49.0 33124 1.3067
0.566 50.0 33800 1.2990
0.5356 51.0 34476 1.3021
0.5331 52.0 35152 1.2996
0.5657 53.0 35828 1.3225
0.5195 54.0 36504 1.3237
0.5199 55.0 37180 1.3253
0.5225 56.0 37856 1.3275
0.4995 57.0 38532 1.3347
0.4991 58.0 39208 1.3356
0.4848 59.0 39884 1.3534
0.4731 60.0 40560 1.3557
0.4526 61.0 41236 1.3445
0.47 62.0 41912 1.3547
0.453 63.0 42588 1.3588
0.4508 64.0 43264 1.3694
0.4316 65.0 43940 1.3753
0.4386 66.0 44616 1.3804
0.4243 67.0 45292 1.3797
0.4188 68.0 45968 1.3833
0.4132 69.0 46644 1.3980
0.4244 70.0 47320 1.3960
0.3925 71.0 47996 1.4038
0.3919 72.0 48672 1.4228
0.3933 73.0 49348 1.4173
0.394 74.0 50024 1.4243
0.3916 75.0 50700 1.4224
0.3745 76.0 51376 1.4274
0.3708 77.0 52052 1.4296
0.3667 78.0 52728 1.4342
0.356 79.0 53404 1.4478
0.3546 80.0 54080 1.4431
0.353 81.0 54756 1.4546
0.3473 82.0 55432 1.4520
0.3442 83.0 56108 1.4526
0.3388 84.0 56784 1.4758
0.3044 85.0 57460 1.4715
0.3268 86.0 58136 1.4972
0.3185 87.0 58812 1.4889
0.3092 88.0 59488 1.4899
0.3044 89.0 60164 1.5039
0.3055 90.0 60840 1.4887
0.3014 91.0 61516 1.5114
0.2955 92.0 62192 1.5135
0.2912 93.0 62868 1.5246
0.2969 94.0 63544 1.5319
0.2787 95.0 64220 1.5254
0.288 96.0 64896 1.5278
0.2629 97.0 65572 1.5385
0.2717 98.0 66248 1.5580
0.2728 99.0 66924 1.5702
0.2616 100.0 67600 1.5533
0.2675 101.0 68276 1.5613
0.2477 102.0 68952 1.5610
0.2553 103.0 69628 1.5691
0.2533 104.0 70304 1.5667
0.2557 105.0 70980 1.6058
0.25 106.0 71656 1.5902
0.2386 107.0 72332 1.6042
0.2351 108.0 73008 1.6072
0.2407 109.0 73684 1.6102
0.2297 110.0 74360 1.6164
0.2363 111.0 75036 1.6082
0.2316 112.0 75712 1.6259
0.2313 113.0 76388 1.6233
0.2338 114.0 77064 1.6290
0.204 115.0 77740 1.6390
0.2353 116.0 78416 1.6311
0.2086 117.0 79092 1.6315
0.2052 118.0 79768 1.6337
0.2239 119.0 80444 1.6419
0.2056 120.0 81120 1.6438
0.2037 121.0 81796 1.6513
0.2059 122.0 82472 1.6664
0.1981 123.0 83148 1.6607
0.1918 124.0 83824 1.6807
0.1894 125.0 84500 1.6726
0.1897 126.0 85176 1.6886
0.1887 127.0 85852 1.6848
0.192 128.0 86528 1.6893
0.1961 129.0 87204 1.6995
0.1772 130.0 87880 1.6966
0.189 131.0 88556 1.7023
0.1759 132.0 89232 1.7025
0.1856 133.0 89908 1.7151
0.1792 134.0 90584 1.7162
0.1767 135.0 91260 1.7129
0.1606 136.0 91936 1.7348
0.1788 137.0 92612 1.7216
0.1608 138.0 93288 1.7401
0.1769 139.0 93964 1.7486
0.162 140.0 94640 1.7506
0.1572 141.0 95316 1.7338
0.1585 142.0 95992 1.7441
0.1638 143.0 96668 1.7660
0.172 144.0 97344 1.7548
0.1638 145.0 98020 1.7673
0.1515 146.0 98696 1.7623
0.1713 147.0 99372 1.7516
0.1434 148.0 100048 1.7782
0.1578 149.0 100724 1.7881
0.1473 150.0 101400 1.7795
0.1491 151.0 102076 1.7861
0.1573 152.0 102752 1.7901
0.1472 153.0 103428 1.7969
0.1469 154.0 104104 1.8035
0.1539 155.0 104780 1.7849
0.1473 156.0 105456 1.7996
0.1414 157.0 106132 1.7976
0.1592 158.0 106808 1.8022
0.1284 159.0 107484 1.7968
0.1373 160.0 108160 1.8150
0.1446 161.0 108836 1.8154
0.1382 162.0 109512 1.8139
0.147 163.0 110188 1.8228
0.1385 164.0 110864 1.8222
0.1288 165.0 111540 1.8229
0.1368 166.0 112216 1.8266
0.1343 167.0 112892 1.8313
0.1346 168.0 113568 1.8199
0.1389 169.0 114244 1.8277
0.1432 170.0 114920 1.8330
0.1268 171.0 115596 1.8358
0.1309 172.0 116272 1.8416
0.1344 173.0 116948 1.8289
0.1338 174.0 117624 1.8418
0.1315 175.0 118300 1.8325
0.1245 176.0 118976 1.8351
0.1305 177.0 119652 1.8503
0.1254 178.0 120328 1.8431
0.1223 179.0 121004 1.8506
0.1234 180.0 121680 1.8480
0.1223 181.0 122356 1.8435
0.1304 182.0 123032 1.8530
0.121 183.0 123708 1.8480
0.1284 184.0 124384 1.8550
0.1339 185.0 125060 1.8578
0.1353 186.0 125736 1.8476
0.1219 187.0 126412 1.8550
0.117 188.0 127088 1.8606
0.1269 189.0 127764 1.8588
0.1118 190.0 128440 1.8564
0.1226 191.0 129116 1.8682
0.1284 192.0 129792 1.8582
0.1125 193.0 130468 1.8603
0.1227 194.0 131144 1.8660
0.1373 195.0 131820 1.8660
0.1122 196.0 132496 1.8647
0.1282 197.0 133172 1.8632
0.1199 198.0 133848 1.8625
0.1281 199.0 134524 1.8640
0.1274 200.0 135200 1.8645
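
Note that the validation loss bottoms out at 1.2478 around epoch 26 and rises steadily afterwards, so the final epoch-200 checkpoint is well past the best point. The card does not say whether the best checkpoint was retained; the sketch below shows one hypothetical way to select it with the Trainer's built-in early stopping (the model and dataset variables are placeholders, since the training data is unspecified):

```python
from transformers import EarlyStoppingCallback, Seq2SeqTrainer, Seq2SeqTrainingArguments

# Assumption: early stopping was NOT used for the run logged above; this only
# shows how to keep the best checkpoint rather than the final (overfit) one.
args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-base-finetuned-justification-v06",
    learning_rate=2e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    num_train_epochs=200,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,        # restore the epoch with the lowest eval_loss
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
trainer = Seq2SeqTrainer(
    model=model,                        # placeholder, e.g. the LongT5 model above
    args=args,
    train_dataset=train_dataset,        # placeholders: the card's dataset is unspecified
    eval_dataset=eval_dataset,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=5)],
)
trainer.train()
```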

Framework versions

  • Transformers 4.38.2
  • PyTorch 2.2.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
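
For reproducibility, a small convenience check (not part of the original card) that the local environment matches these versions:

```python
import datasets
import tokenizers
import torch
import transformers

# Expected versions per the list above.
expected = {
    "transformers": "4.38.2",
    "torch": "2.2.2+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    print(f"{name}: {have} (card: {want}) {'OK' if have == want else 'MISMATCH'}")
```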