
long-t5-local-base-finetuned-justification-v04

This model is a fine-tuned version of google/long-t5-local-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.1250
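
As a quick smoke test, the checkpoint can be loaded with the standard seq2seq auto classes. Note the repo id below is assumed from the card title (adjust the namespace to the actual owner), and the input text is a placeholder since the card does not document the expected prompt format:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repo id assumed from the card title; prepend the actual owner's namespace.
model_id = "long-t5-local-base-finetuned-justification-v04"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input; the intended task/prompt format is not documented above.
inputs = tokenizer("Generate a justification for the decision above.", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```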

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after the list):

  • learning_rate: 2e-07
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
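
To reproduce these settings, the list above maps onto Seq2SeqTrainingArguments roughly as follows. This is a minimal sketch: output_dir and the per-epoch evaluation strategy are assumptions, not taken from the card; the remaining values mirror the list.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-base-finetuned-justification-v04",  # assumed
    learning_rate=2e-7,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    adam_beta1=0.9,               # Adam with betas=(0.9, 0.999), eps=1e-08
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # assumed: the results table logs validation loss once per epoch
)
```

These arguments would then be passed to a Seq2SeqTrainer together with the model, tokenizer, and train/eval datasets.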

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 28.8458 | 1.0 | 676 | 22.6997 |
| 27.4586 | 2.0 | 1352 | 20.9991 |
| 24.6852 | 3.0 | 2028 | 19.3376 |
| 23.1391 | 4.0 | 2704 | 17.6632 |
| 21.9699 | 5.0 | 3380 | 15.9812 |
| 19.2778 | 6.0 | 4056 | 14.2373 |
| 17.8249 | 7.0 | 4732 | 12.4497 |
| 16.3797 | 8.0 | 5408 | 10.5990 |
| 13.5688 | 9.0 | 6084 | 8.7096 |
| 11.8803 | 10.0 | 6760 | 7.3735 |
| 10.6635 | 11.0 | 7436 | 6.9285 |
| 7.8818 | 12.0 | 8112 | 6.7818 |
| 7.0876 | 13.0 | 8788 | 6.7247 |
| 6.4869 | 14.0 | 9464 | 6.6927 |
| 5.4921 | 15.0 | 10140 | 6.6090 |
| 5.4028 | 16.0 | 10816 | 6.4377 |
| 5.1724 | 17.0 | 11492 | 6.1746 |
| 4.7436 | 18.0 | 12168 | 5.9188 |
| 4.8811 | 19.0 | 12844 | 5.6474 |
| 4.5173 | 20.0 | 13520 | 5.3848 |
| 4.463 | 21.0 | 14196 | 5.1572 |
| 4.2793 | 22.0 | 14872 | 4.9559 |
| 4.1378 | 23.0 | 15548 | 4.7922 |
| 4.1071 | 24.0 | 16224 | 4.6584 |
| 3.9575 | 25.0 | 16900 | 4.5451 |
| 3.913 | 26.0 | 17576 | 4.4492 |
| 3.8237 | 27.0 | 18252 | 4.3749 |
| 3.6779 | 28.0 | 18928 | 4.3044 |
| 3.7032 | 29.0 | 19604 | 4.2445 |
| 3.6205 | 30.0 | 20280 | 4.1905 |
| 3.5488 | 31.0 | 20956 | 4.1418 |
| 3.5511 | 32.0 | 21632 | 4.0970 |
| 3.4027 | 33.0 | 22308 | 4.0561 |
| 3.4178 | 34.0 | 22984 | 4.0172 |
| 3.4568 | 35.0 | 23660 | 3.9798 |
| 3.3257 | 36.0 | 24336 | 3.9451 |
| 3.3202 | 37.0 | 25012 | 3.9120 |
| 3.3266 | 38.0 | 25688 | 3.8796 |
| 3.1662 | 39.0 | 26364 | 3.8489 |
| 3.1681 | 40.0 | 27040 | 3.8190 |
| 3.194 | 41.0 | 27716 | 3.7912 |
| 3.138 | 42.0 | 28392 | 3.7628 |
| 3.2074 | 43.0 | 29068 | 3.7340 |
| 3.1001 | 44.0 | 29744 | 3.7068 |
| 3.0341 | 45.0 | 30420 | 3.6814 |
| 3.0394 | 46.0 | 31096 | 3.6553 |
| 3.0737 | 47.0 | 31772 | 3.6293 |
| 3.0273 | 48.0 | 32448 | 3.6053 |
| 3.0223 | 49.0 | 33124 | 3.5809 |
| 3.04 | 50.0 | 33800 | 3.5563 |
| 2.9182 | 51.0 | 34476 | 3.5335 |
| 2.9424 | 52.0 | 35152 | 3.5120 |
| 3.0383 | 53.0 | 35828 | 3.4895 |
| 2.9066 | 54.0 | 36504 | 3.4680 |
| 2.9488 | 55.0 | 37180 | 3.4477 |
| 2.9238 | 56.0 | 37856 | 3.4284 |
| 2.8921 | 57.0 | 38532 | 3.4095 |
| 2.9047 | 58.0 | 39208 | 3.3916 |
| 2.8593 | 59.0 | 39884 | 3.3749 |
| 2.8443 | 60.0 | 40560 | 3.3587 |
| 2.7948 | 61.0 | 41236 | 3.3435 |
| 2.8484 | 62.0 | 41912 | 3.3290 |
| 2.8094 | 63.0 | 42588 | 3.3143 |
| 2.7928 | 64.0 | 43264 | 3.3019 |
| 2.7722 | 65.0 | 43940 | 3.2888 |
| 2.7996 | 66.0 | 44616 | 3.2757 |
| 2.7521 | 67.0 | 45292 | 3.2646 |
| 2.7506 | 68.0 | 45968 | 3.2536 |
| 2.7454 | 69.0 | 46644 | 3.2433 |
| 2.782 | 70.0 | 47320 | 3.2333 |
| 2.733 | 71.0 | 47996 | 3.2244 |
| 2.7219 | 72.0 | 48672 | 3.2158 |
| 2.7289 | 73.0 | 49348 | 3.2075 |
| 2.7365 | 74.0 | 50024 | 3.2001 |
| 2.7487 | 75.0 | 50700 | 3.1934 |
| 2.6969 | 76.0 | 51376 | 3.1863 |
| 2.7079 | 77.0 | 52052 | 3.1803 |
| 2.717 | 78.0 | 52728 | 3.1741 |
| 2.7059 | 79.0 | 53404 | 3.1690 |
| 2.681 | 80.0 | 54080 | 3.1639 |
| 2.7309 | 81.0 | 54756 | 3.1592 |
| 2.6887 | 82.0 | 55432 | 3.1546 |
| 2.7021 | 83.0 | 56108 | 3.1506 |
| 2.7144 | 84.0 | 56784 | 3.1474 |
| 2.6032 | 85.0 | 57460 | 3.1443 |
| 2.6943 | 86.0 | 58136 | 3.1411 |
| 2.6888 | 87.0 | 58812 | 3.1382 |
| 2.6167 | 88.0 | 59488 | 3.1356 |
| 2.6672 | 89.0 | 60164 | 3.1333 |
| 2.6447 | 90.0 | 60840 | 3.1315 |
| 2.668 | 91.0 | 61516 | 3.1300 |
| 2.6378 | 92.0 | 62192 | 3.1287 |
| 2.7002 | 93.0 | 62868 | 3.1277 |
| 2.6958 | 94.0 | 63544 | 3.1269 |
| 2.6296 | 95.0 | 64220 | 3.1262 |
| 2.6784 | 96.0 | 64896 | 3.1257 |
| 2.6044 | 97.0 | 65572 | 3.1253 |
| 2.6682 | 98.0 | 66248 | 3.1251 |
| 2.6628 | 99.0 | 66924 | 3.1250 |
| 2.6305 | 100.0 | 67600 | 3.1250 |
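
For reference, a cross-entropy loss maps to perplexity via exp(loss). A minimal check, assuming the reported validation loss is the mean token-level cross-entropy that the Trainer logs for seq2seq fine-tuning:

```python
import math

# Assumes the final validation loss is mean token-level cross-entropy.
final_val_loss = 3.1250
perplexity = math.exp(final_val_loss)
print(f"{perplexity:.2f}")  # ~22.76
```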

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
Model size

  • 248M params
  • Tensor type: F32 (Safetensors)