
long-t5-local-base-ARv1

This model is a fine-tuned version of google/long-t5-local-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 2.9303
  • Exact Match: 18.0
  • Gen Len: 3.38
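
Exact Match here is presumably the percentage of generated outputs that match their references exactly, and Gen Len the mean length in tokens of the generated sequences. A minimal sketch of how such metrics are commonly computed in a seq2seq evaluation loop (the `compute_metrics` function below is an assumption, not the card author's code):

```python
import numpy as np

def compute_metrics(eval_preds, tokenizer):
    # Hedged sketch: exact-match percentage and mean generated length.
    preds, labels = eval_preds
    # The Trainer pads labels with -100; restore pad_token_id before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    exact_match = 100.0 * np.mean(
        [p.strip() == l.strip() for p, l in zip(decoded_preds, decoded_labels)]
    )
    gen_len = np.mean(
        [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    )
    return {"exact_match": exact_match, "gen_len": round(float(gen_len), 2)}
```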

Model description

More information needed

Intended uses & limitations

More information needed
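
Although intended uses are not documented, loading and running the checkpoint follows the standard Transformers seq2seq pattern. A minimal sketch, assuming the checkpoint is available locally or on the Hub under the hypothetical id `long-t5-local-base-ARv1`:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hypothetical repo id; substitute the actual path or Hub id of this checkpoint.
model_id = "long-t5-local-base-ARv1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("your input text here", return_tensors="pt")
# Generations in this card average ~3.4 tokens, so a small budget suffices.
output_ids = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```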

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60
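
These settings map directly onto Transformers' Seq2SeqTrainingArguments; the Adam betas and epsilon listed above are the library defaults. A minimal sketch of an equivalent training setup (the dataset variables, data collator, and per-epoch evaluation strategy are assumptions, not details stated in the card):

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google/long-t5-local-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/long-t5-local-base")

args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-base-ARv1",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=60,
    evaluation_strategy="epoch",   # assumption: matches the per-epoch results table
    predict_with_generate=True,    # assumption: required for Exact Match / Gen Len
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,   # assumed: a pre-tokenized seq2seq train split
    eval_dataset=eval_dataset,     # assumed: the evaluation split used in the card
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()
```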

Training results

| Training Loss | Epoch | Step | Validation Loss | Exact Match | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-----------:|:-------:|
| No log | 1.0 | 7 | 3.4004 | 14.0 | 3.86 |
| 2.7206 | 2.0 | 14 | 3.1925 | 8.0 | 3.66 |
| 2.6501 | 3.0 | 21 | 2.9867 | 8.0 | 3.7 |
| 2.6501 | 4.0 | 28 | 2.8576 | 12.0 | 4.58 |
| 1.9849 | 5.0 | 35 | 2.9078 | 12.0 | 4.52 |
| 2.0193 | 6.0 | 42 | 2.8173 | 8.0 | 3.84 |
| 2.0193 | 7.0 | 49 | 2.7735 | 16.0 | 3.42 |
| 1.6108 | 8.0 | 56 | 2.5993 | 12.0 | 3.82 |
| 1.8323 | 9.0 | 63 | 2.5879 | 12.0 | 3.92 |
| 1.4861 | 10.0 | 70 | 2.7203 | 16.0 | 3.4 |
| 1.4861 | 11.0 | 77 | 2.9902 | 24.0 | 3.1 |
| 1.425 | 12.0 | 84 | 2.7667 | 14.0 | 3.36 |
| 1.0387 | 13.0 | 91 | 2.6547 | 18.0 | 3.42 |
| 1.0387 | 14.0 | 98 | 2.7072 | 18.0 | 3.34 |
| 1.0793 | 15.0 | 105 | 2.8158 | 12.0 | 3.58 |
| 1.1969 | 16.0 | 112 | 2.9404 | 14.0 | 3.32 |
| 1.1969 | 17.0 | 119 | 2.8512 | 14.0 | 3.3 |
| 1.15 | 18.0 | 126 | 2.7513 | 18.0 | 3.68 |
| 1.2024 | 19.0 | 133 | 2.7124 | 16.0 | 3.48 |
| 1.3331 | 20.0 | 140 | 2.7484 | 16.0 | 3.4 |
| 1.3331 | 21.0 | 147 | 2.8289 | 18.0 | 3.44 |
| 1.1469 | 22.0 | 154 | 2.9873 | 14.0 | 3.36 |
| 1.5639 | 23.0 | 161 | 3.0321 | 18.0 | 3.4 |
| 1.5639 | 24.0 | 168 | 3.0117 | 14.0 | 3.3 |
| 0.8542 | 25.0 | 175 | 2.8331 | 16.0 | 3.34 |
| 0.9789 | 26.0 | 182 | 2.7876 | 20.0 | 3.36 |
| 0.9789 | 27.0 | 189 | 2.7820 | 20.0 | 3.36 |
| 0.8853 | 28.0 | 196 | 2.8082 | 18.0 | 3.38 |
| 0.9126 | 29.0 | 203 | 2.8316 | 16.0 | 3.36 |
| 1.0543 | 30.0 | 210 | 2.8449 | 18.0 | 3.64 |
| 1.0543 | 31.0 | 217 | 2.8034 | 8.0 | 3.62 |
| 1.0683 | 32.0 | 224 | 2.8115 | 14.0 | 3.46 |
| 0.951 | 33.0 | 231 | 2.9019 | 18.0 | 3.34 |
| 0.951 | 34.0 | 238 | 3.0115 | 18.0 | 3.24 |
| 0.8315 | 35.0 | 245 | 3.0392 | 18.0 | 3.24 |
| 1.1548 | 36.0 | 252 | 3.0643 | 18.0 | 3.36 |
| 1.1548 | 37.0 | 259 | 3.0031 | 16.0 | 3.42 |
| 0.7813 | 38.0 | 266 | 2.9801 | 18.0 | 3.48 |
| 0.671 | 39.0 | 273 | 2.9622 | 18.0 | 3.48 |
| 1.1771 | 40.0 | 280 | 2.9049 | 18.0 | 3.46 |
| 1.1771 | 41.0 | 287 | 2.9042 | 20.0 | 3.56 |
| 0.5959 | 42.0 | 294 | 2.9598 | 18.0 | 3.48 |
| 1.1583 | 43.0 | 301 | 2.9936 | 18.0 | 3.44 |
| 1.1583 | 44.0 | 308 | 3.0072 | 18.0 | 3.44 |
| 0.5728 | 45.0 | 315 | 3.0003 | 18.0 | 3.44 |
| 0.7237 | 46.0 | 322 | 3.0093 | 16.0 | 3.4 |
| 0.7237 | 47.0 | 329 | 2.9688 | 18.0 | 3.42 |
| 0.7295 | 48.0 | 336 | 2.9533 | 18.0 | 3.38 |
| 0.5627 | 49.0 | 343 | 2.9357 | 18.0 | 3.36 |
| 0.6489 | 50.0 | 350 | 2.9317 | 18.0 | 3.4 |
| 0.6489 | 51.0 | 357 | 2.9339 | 18.0 | 3.4 |
| 1.0427 | 52.0 | 364 | 2.9256 | 18.0 | 3.4 |
| 0.9156 | 53.0 | 371 | 2.9220 | 18.0 | 3.4 |
| 0.9156 | 54.0 | 378 | 2.9091 | 18.0 | 3.38 |
| 0.4748 | 55.0 | 385 | 2.9036 | 18.0 | 3.36 |
| 0.5616 | 56.0 | 392 | 2.8998 | 18.0 | 3.36 |
| 0.5616 | 57.0 | 399 | 2.9128 | 18.0 | 3.36 |
| 0.4836 | 58.0 | 406 | 2.9205 | 18.0 | 3.36 |
| 0.6498 | 59.0 | 413 | 2.9282 | 18.0 | 3.36 |
| 0.615 | 60.0 | 420 | 2.9303 | 18.0 | 3.38 |

Framework versions

  • Transformers 4.41.0
  • Pytorch 2.2.1
  • Datasets 2.19.1
  • Tokenizers 0.19.1