
long-t5-local-base-finetuned-justification-v03

This model is a fine-tuned version of google/long-t5-local-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0759
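
Although usage details are not documented, the checkpoint can be loaded with the standard Transformers Auto classes. The following is a minimal sketch, assuming the repository id matches the card title and that inputs follow the usual text-to-text format; both are assumptions, since the card does not specify them.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repo id assumed from the card title; adjust to the actual Hub path.
repo_id = "long-t5-local-base-finetuned-justification-v03"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

# Illustrative input only: the expected input/output format is undocumented.
inputs = tokenizer("Example input document ...", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```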

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a minimal configuration sketch follows the list):

  • learning_rate: 2e-07
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
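
For reference, these settings map onto the standard Transformers Seq2SeqTrainingArguments as sketched below. This is a minimal reconstruction, not the actual training script (the dataset and preprocessing are undocumented), and the Adam betas/epsilon shown are simply the library defaults made explicit.

```python
from transformers import Seq2SeqTrainingArguments

# Minimal reconstruction of the reported hyperparameters. The actual
# training script and dataset are not documented in this card.
args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-base-finetuned-justification-v03",
    learning_rate=2e-07,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,               # library default, stated in the card
    adam_beta2=0.999,             # library default, stated in the card
    adam_epsilon=1e-08,           # library default, stated in the card
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # consistent with the per-epoch eval log
)
```

These arguments would then be passed to a Seq2SeqTrainer together with the model and the (undocumented) train and eval datasets.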

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 37.3451       | 1.0   | 676   | 30.7532         |
| 35.6146       | 2.0   | 1352  | 28.8261         |
| 32.3915       | 3.0   | 2028  | 26.9524         |
| 30.6984       | 4.0   | 2704  | 25.0794         |
| 29.1119       | 5.0   | 3380  | 23.2287         |
| 25.8833       | 6.0   | 4056  | 21.3800         |
| 24.6464       | 7.0   | 4732  | 19.5382         |
| 23.0238       | 8.0   | 5408  | 17.7077         |
| 20.1628       | 9.0   | 6084  | 15.7830         |
| 18.3835       | 10.0  | 6760  | 13.7916         |
| 17.0307       | 11.0  | 7436  | 11.7701         |
| 13.9093       | 12.0  | 8112  | 9.7176          |
| 12.3394       | 13.0  | 8788  | 7.7920          |
| 10.888        | 14.0  | 9464  | 6.6093          |
| 8.1438        | 15.0  | 10140 | 6.1657          |
| 7.1948        | 16.0  | 10816 | 5.9876          |
| 6.4311        | 17.0  | 11492 | 5.8574          |
| 5.2684        | 18.0  | 12168 | 5.7303          |
| 5.2309        | 19.0  | 12844 | 5.6090          |
| 4.6475        | 20.0  | 13520 | 5.4782          |
| 4.583         | 21.0  | 14196 | 5.3425          |
| 4.3645        | 22.0  | 14872 | 5.2021          |
| 4.1721        | 23.0  | 15548 | 5.0568          |
| 4.1423        | 24.0  | 16224 | 4.9156          |
| 3.983         | 25.0  | 16900 | 4.7832          |
| 3.9396        | 26.0  | 17576 | 4.6574          |
| 3.8342        | 27.0  | 18252 | 4.5455          |
| 3.651         | 28.0  | 18928 | 4.4371          |
| 3.663         | 29.0  | 19604 | 4.3453          |
| 3.5847        | 30.0  | 20280 | 4.2648          |
| 3.5013        | 31.0  | 20956 | 4.1942          |
| 3.5122        | 32.0  | 21632 | 4.1298          |
| 3.3473        | 33.0  | 22308 | 4.0700          |
| 3.3417        | 34.0  | 22984 | 4.0167          |
| 3.3881        | 35.0  | 23660 | 3.9700          |
| 3.2404        | 36.0  | 24336 | 3.9258          |
| 3.2232        | 37.0  | 25012 | 3.8830          |
| 3.2287        | 38.0  | 25688 | 3.8438          |
| 3.0759        | 39.0  | 26364 | 3.8066          |
| 3.053         | 40.0  | 27040 | 3.7711          |
| 3.0726        | 41.0  | 27716 | 3.7386          |
| 3.0198        | 42.0  | 28392 | 3.7072          |
| 3.0923        | 43.0  | 29068 | 3.6768          |
| 2.986         | 44.0  | 29744 | 3.6489          |
| 2.9184        | 45.0  | 30420 | 3.6221          |
| 2.9114        | 46.0  | 31096 | 3.5972          |
| 2.9585        | 47.0  | 31772 | 3.5736          |
| 2.91          | 48.0  | 32448 | 3.5477          |
| 2.8974        | 49.0  | 33124 | 3.5252          |
| 2.9211        | 50.0  | 33800 | 3.5020          |
| 2.785         | 51.0  | 34476 | 3.4795          |
| 2.8177        | 52.0  | 35152 | 3.4581          |
| 2.9204        | 53.0  | 35828 | 3.4392          |
| 2.7911        | 54.0  | 36504 | 3.4199          |
| 2.8178        | 55.0  | 37180 | 3.4013          |
| 2.8029        | 56.0  | 37856 | 3.3830          |
| 2.7654        | 57.0  | 38532 | 3.3655          |
| 2.7854        | 58.0  | 39208 | 3.3486          |
| 2.7322        | 59.0  | 39884 | 3.3314          |
| 2.722         | 60.0  | 40560 | 3.3161          |
| 2.6665        | 61.0  | 41236 | 3.3016          |
| 2.719         | 62.0  | 41912 | 3.2865          |
| 2.6758        | 63.0  | 42588 | 3.2720          |
| 2.6586        | 64.0  | 43264 | 3.2582          |
| 2.6443        | 65.0  | 43940 | 3.2451          |
| 2.6656        | 66.0  | 44616 | 3.2322          |
| 2.6183        | 67.0  | 45292 | 3.2206          |
| 2.6143        | 68.0  | 45968 | 3.2087          |
| 2.6117        | 69.0  | 46644 | 3.1989          |
| 2.6567        | 70.0  | 47320 | 3.1890          |
| 2.5946        | 71.0  | 47996 | 3.1797          |
| 2.5836        | 72.0  | 48672 | 3.1703          |
| 2.5928        | 73.0  | 49348 | 3.1627          |
| 2.6024        | 74.0  | 50024 | 3.1552          |
| 2.6117        | 75.0  | 50700 | 3.1469          |
| 2.5681        | 76.0  | 51376 | 3.1392          |
| 2.5731        | 77.0  | 52052 | 3.1329          |
| 2.5749        | 78.0  | 52728 | 3.1267          |
| 2.5788        | 79.0  | 53404 | 3.1208          |
| 2.5478        | 80.0  | 54080 | 3.1156          |
| 2.6048        | 81.0  | 54756 | 3.1104          |
| 2.5514        | 82.0  | 55432 | 3.1062          |
| 2.5705        | 83.0  | 56108 | 3.1027          |
| 2.5733        | 84.0  | 56784 | 3.0994          |
| 2.4704        | 85.0  | 57460 | 3.0965          |
| 2.558         | 86.0  | 58136 | 3.0928          |
| 2.5566        | 87.0  | 58812 | 3.0895          |
| 2.4822        | 88.0  | 59488 | 3.0869          |
| 2.5377        | 89.0  | 60164 | 3.0844          |
| 2.5173        | 90.0  | 60840 | 3.0826          |
| 2.5312        | 91.0  | 61516 | 3.0809          |
| 2.5038        | 92.0  | 62192 | 3.0799          |
| 2.5645        | 93.0  | 62868 | 3.0788          |
| 2.5612        | 94.0  | 63544 | 3.0778          |
| 2.4948        | 95.0  | 64220 | 3.0770          |
| 2.538         | 96.0  | 64896 | 3.0765          |
| 2.4701        | 97.0  | 65572 | 3.0762          |
| 2.5269        | 98.0  | 66248 | 3.0759          |
| 2.5265        | 99.0  | 66924 | 3.0759          |
| 2.4955        | 100.0 | 67600 | 3.0759          |
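
If the reported validation loss is the mean token-level cross-entropy (the usual definition for this kind of seq2seq fine-tuning; the card does not state it explicitly), the final loss corresponds to a perplexity of roughly exp(3.0759) ≈ 21.7:

```python
import math

# Perplexity from the final validation loss, assuming the loss is mean
# token-level cross-entropy (an assumption; not stated in this card).
final_val_loss = 3.0759
print(f"perplexity ≈ {math.exp(final_val_loss):.2f}")  # ≈ 21.67
```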

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model size

  • 248M parameters (F32 tensors, Safetensors format)