long-t5-local-base-finetuned-justification-v05

This model is a fine-tuned version of google/long-t5-local-base. The training dataset is not specified in this card. It achieves the following result on the evaluation set:

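  • Loss: 1.3822 (after the final epoch, 100; the lowest validation loss in the training results is 1.2653, at epoch 34)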
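
The card gives no usage snippet. The following is a minimal sketch of loading the checkpoint for generation with the Transformers auto classes. The repository id is hypothetical (the `<namespace>` placeholder must be replaced with the real owner), and the expected input format and the `max_new_tokens` default are assumptions, since the card does not document the task.

```python
# Hypothetical repository id: replace <namespace> with the actual owner of the checkpoint.
MODEL_ID = "<namespace>/long-t5-local-base-finetuned-justification-v05"


def generate_justification(text, model_id=MODEL_ID, max_new_tokens=128):
    """Generate output text for `text` with the fine-tuned LongT5 checkpoint.

    The function name and the max_new_tokens default are illustrative
    assumptions; the card does not describe the intended input or output.
    """
    # Imported lazily so this module loads even where transformers is absent.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Encode the input as PyTorch tensors and decode the first generated sequence.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_justification("some input text")` downloads the weights on first use, so it needs network access and a valid repository id.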

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
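
To make the linear schedule concrete, here is a small sketch of the learning rate it implies at a given optimizer step. It assumes no warmup (none is listed above) and takes the 676 steps per epoch from the training results table; the function name `linear_lr` is illustrative and not part of any library.

```python
BASE_LR = 2e-5          # learning_rate from the list above
STEPS_PER_EPOCH = 676   # taken from the Step column of the training results
NUM_EPOCHS = 100        # num_epochs from the list above
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 67600 optimizer steps


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at `step` for a linear decay to zero.

    warmup_steps=0 is an assumption: the card lists no warmup setting.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 25, 50, 75, 100):
        print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the rate is halved by epoch 50 and reaches zero at step 67600. With train_batch_size 1, 676 steps per epoch also suggests roughly 676 training examples, provided training ran on a single device without gradient accumulation, which the card does not state.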

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 4.7549 | 1.0 | 676 | 2.2649 |
| 1.8473 | 2.0 | 1352 | 1.9235 |
| 1.4521 | 3.0 | 2028 | 1.7887 |
| 1.4025 | 4.0 | 2704 | 1.6854 |
| 1.3373 | 5.0 | 3380 | 1.6227 |
| 1.2577 | 6.0 | 4056 | 1.5758 |
| 1.2089 | 7.0 | 4732 | 1.5332 |
| 1.1828 | 8.0 | 5408 | 1.4992 |
| 1.152 | 9.0 | 6084 | 1.4637 |
| 1.1035 | 10.0 | 6760 | 1.4390 |
| 1.1482 | 11.0 | 7436 | 1.4142 |
| 1.0386 | 12.0 | 8112 | 1.3913 |
| 1.0287 | 13.0 | 8788 | 1.3702 |
| 1.0355 | 14.0 | 9464 | 1.3613 |
| 0.9495 | 15.0 | 10140 | 1.3458 |
| 0.9726 | 16.0 | 10816 | 1.3346 |
| 0.9417 | 17.0 | 11492 | 1.3223 |
| 0.8965 | 18.0 | 12168 | 1.3155 |
| 0.9588 | 19.0 | 12844 | 1.3123 |
| 0.8909 | 20.0 | 13520 | 1.3079 |
| 0.8935 | 21.0 | 14196 | 1.2922 |
| 0.8629 | 22.0 | 14872 | 1.2886 |
| 0.8533 | 23.0 | 15548 | 1.2938 |
| 0.8676 | 24.0 | 16224 | 1.2867 |
| 0.8245 | 25.0 | 16900 | 1.2804 |
| 0.8449 | 26.0 | 17576 | 1.2781 |
| 0.817 | 27.0 | 18252 | 1.2816 |
| 0.7731 | 28.0 | 18928 | 1.2768 |
| 0.8 | 29.0 | 19604 | 1.2732 |
| 0.785 | 30.0 | 20280 | 1.2785 |
| 0.7751 | 31.0 | 20956 | 1.2790 |
| 0.7829 | 32.0 | 21632 | 1.2817 |
| 0.7334 | 33.0 | 22308 | 1.2749 |
| 0.7409 | 34.0 | 22984 | 1.2653 |
| 0.761 | 35.0 | 23660 | 1.2711 |
| 0.7175 | 36.0 | 24336 | 1.2744 |
| 0.7244 | 37.0 | 25012 | 1.2728 |
| 0.7229 | 38.0 | 25688 | 1.2758 |
| 0.6733 | 39.0 | 26364 | 1.2783 |
| 0.6748 | 40.0 | 27040 | 1.2858 |
| 0.6798 | 41.0 | 27716 | 1.2810 |
| 0.6682 | 42.0 | 28392 | 1.2809 |
| 0.6945 | 43.0 | 29068 | 1.2900 |
| 0.6563 | 44.0 | 29744 | 1.2896 |
| 0.6362 | 45.0 | 30420 | 1.2918 |
| 0.6391 | 46.0 | 31096 | 1.2923 |
| 0.6379 | 47.0 | 31772 | 1.2853 |
| 0.6384 | 48.0 | 32448 | 1.2888 |
| 0.6341 | 49.0 | 33124 | 1.2932 |
| 0.6344 | 50.0 | 33800 | 1.2958 |
| 0.6012 | 51.0 | 34476 | 1.2995 |
| 0.6029 | 52.0 | 35152 | 1.3007 |
| 0.6376 | 53.0 | 35828 | 1.3127 |
| 0.5931 | 54.0 | 36504 | 1.3073 |
| 0.5984 | 55.0 | 37180 | 1.3092 |
| 0.6052 | 56.0 | 37856 | 1.3134 |
| 0.5843 | 57.0 | 38532 | 1.3158 |
| 0.5877 | 58.0 | 39208 | 1.3204 |
| 0.5708 | 59.0 | 39884 | 1.3335 |
| 0.566 | 60.0 | 40560 | 1.3294 |
| 0.542 | 61.0 | 41236 | 1.3305 |
| 0.5594 | 62.0 | 41912 | 1.3248 |
| 0.5461 | 63.0 | 42588 | 1.3323 |
| 0.5498 | 64.0 | 43264 | 1.3336 |
| 0.5302 | 65.0 | 43940 | 1.3314 |
| 0.5418 | 66.0 | 44616 | 1.3297 |
| 0.5319 | 67.0 | 45292 | 1.3298 |
| 0.5241 | 68.0 | 45968 | 1.3440 |
| 0.5275 | 69.0 | 46644 | 1.3418 |
| 0.5427 | 70.0 | 47320 | 1.3493 |
| 0.5078 | 71.0 | 47996 | 1.3477 |
| 0.5075 | 72.0 | 48672 | 1.3510 |
| 0.5116 | 73.0 | 49348 | 1.3460 |
| 0.521 | 74.0 | 50024 | 1.3519 |
| 0.5191 | 75.0 | 50700 | 1.3550 |
| 0.5053 | 76.0 | 51376 | 1.3573 |
| 0.5063 | 77.0 | 52052 | 1.3576 |
| 0.5057 | 78.0 | 52728 | 1.3599 |
| 0.4916 | 79.0 | 53404 | 1.3657 |
| 0.4946 | 80.0 | 54080 | 1.3604 |
| 0.502 | 81.0 | 54756 | 1.3656 |
| 0.4915 | 82.0 | 55432 | 1.3640 |
| 0.4926 | 83.0 | 56108 | 1.3659 |
| 0.4935 | 84.0 | 56784 | 1.3665 |
| 0.4566 | 85.0 | 57460 | 1.3685 |
| 0.4876 | 86.0 | 58136 | 1.3747 |
| 0.4822 | 87.0 | 58812 | 1.3711 |
| 0.4656 | 88.0 | 59488 | 1.3752 |
| 0.4705 | 89.0 | 60164 | 1.3727 |
| 0.477 | 90.0 | 60840 | 1.3736 |
| 0.4752 | 91.0 | 61516 | 1.3772 |
| 0.4712 | 92.0 | 62192 | 1.3778 |
| 0.4753 | 93.0 | 62868 | 1.3774 |
| 0.4827 | 94.0 | 63544 | 1.3821 |
| 0.4605 | 95.0 | 64220 | 1.3784 |
| 0.4757 | 96.0 | 64896 | 1.3835 |
| 0.4482 | 97.0 | 65572 | 1.3819 |
| 0.467 | 98.0 | 66248 | 1.3800 |
| 0.4684 | 99.0 | 66924 | 1.3816 |
| 0.4613 | 100.0 | 67600 | 1.3822 |
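
The log above shows validation loss falling until about a third of the way through training and then rising while training loss keeps dropping, which is the usual sign of overfitting. A minimal sketch of locating the best epoch from such a log follows; the helper names are illustrative, and `LOG_EXCERPT` holds only a few rows copied from the table rather than the full log.

```python
# A few rows copied from the training results
# (training loss, epoch, step, validation loss).
LOG_EXCERPT = """\
0.7334 33.0 22308 1.2749
0.7409 34.0 22984 1.2653
0.761 35.0 23660 1.2711
0.6344 50.0 33800 1.2958
0.4613 100.0 67600 1.3822
"""


def parse_rows(text):
    """Parse log rows into dicts, skipping lines that are not four numbers."""
    rows = []
    for line in text.splitlines():
        # Tolerate pipe-delimited rows as well as whitespace-delimited ones.
        parts = line.replace("|", " ").split()
        if len(parts) != 4:
            continue
        try:
            train_loss, epoch, step, val_loss = (float(p) for p in parts)
        except ValueError:
            continue  # header or separator line
        rows.append({
            "train_loss": train_loss,
            "epoch": int(epoch),
            "step": int(step),
            "val_loss": val_loss,
        })
    return rows


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r["val_loss"])


if __name__ == "__main__":
    best = best_row(parse_rows(LOG_EXCERPT))
    print(best["epoch"], best["step"], best["val_loss"])
```

On the excerpt this selects epoch 34 (step 22984, validation loss 1.2653). Whether that checkpoint was saved is not stated in the card. For future runs, the Trainer's `load_best_model_at_end` option and `EarlyStoppingCallback` are one way to keep the best checkpoint instead of the last.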

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Safetensors

  • Model size: 248M params
  • Tensor type: F32