mT5-TextSimp-LT-BatchSize4-lr5e-5

This model is a fine-tuned version of google/mt5-base. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set:

  • Loss: 0.1611
  • Rouge1: 0.46
  • Rouge2: 0.2767
  • Rougel: 0.4464
  • Sacrebleu: 23.2936
  • Gen Len: 39.0358

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 8
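
To make the learning-rate settings above concrete, here is a minimal sketch of a linear schedule with warmup: the rate rises linearly from 0 to the peak over the warmup steps, then decays linearly to 0 at the end of training. The total step count is an assumption, not a value stated in this card: the Epoch and Step columns of the training log suggest roughly 418 optimizer steps per epoch, which gives about 3344 steps over 8 epochs. The function name `linear_lr` is also illustrative.

```python
BASE_LR = 5e-5       # learning_rate from the card
WARMUP_STEPS = 500   # lr_scheduler_warmup_steps from the card
TOTAL_STEPS = 3344   # ASSUMPTION: ~418 steps/epoch * 8 epochs, estimated from the log


def linear_lr(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at a given optimizer step under linear warmup and linear decay."""
    if step < warmup:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup
    # Decay phase: scale linearly from base_lr down to 0 at the final step.
    return base_lr * max(0.0, (total - step) / (total - warmup))


if __name__ == "__main__":
    for step in (0, 250, 500, 1922, 3344):
        print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the peak rate of 5e-05 is reached at step 500, so the first logged evaluations (steps 200 and 400) fall inside the warmup phase.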

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Sacrebleu | Gen Len  |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|----------|
| 36.6446       | 0.48  | 200  | 31.2765         | 0.0004 | 0.0    | 0.0004 | 0.0003    | 512.0    |
| 11.5223       | 0.96  | 400  | 6.7786          | 0.0031 | 0.0    | 0.0031 | 0.0045    | 89.2816  |
| 2.2686        | 1.44  | 600  | 0.6729          | 0.0054 | 0.0    | 0.0053 | 0.0196    | 39.0501  |
| 0.7009        | 1.91  | 800  | 0.6529          | 0.0029 | 0.0    | 0.0027 | 0.0424    | 41.401   |
| 0.6213        | 2.39  | 1000 | 0.5630          | 0.0058 | 0.0002 | 0.0056 | 0.0201    | 39.0334  |
| 0.6435        | 2.87  | 1200 | 0.4697          | 0.0688 | 0.0084 | 0.0608 | 0.1156    | 39.0453  |
| 0.4154        | 3.35  | 1400 | 10.4655         | 0.2098 | 0.1219 | 0.2011 | 0.671     | 350.0334 |
| 0.6289        | 3.83  | 1600 | 1.9257          | 0.3176 | 0.1945 | 0.3072 | 3.6031    | 138.7494 |
| 3.5542        | 4.31  | 1800 | 0.8459          | 0.373  | 0.2029 | 0.3615 | 16.8305   | 59.8568  |
| 8.1736        | 4.78  | 2000 | 7.2350          | 0.3147 | 0.1815 | 0.3033 | 7.3572    | 289.1432 |
| 2.3987        | 5.26  | 2200 | 0.8361          | 0.3616 | 0.1903 | 0.3501 | 16.2229   | 61.0668  |
| 0.9853        | 5.74  | 2400 | 0.4219          | 0.3635 | 0.2004 | 0.3515 | 15.2744   | 46.494   |
| 0.3575        | 6.22  | 2600 | 0.3516          | 0.3796 | 0.2121 | 0.3687 | 13.6464   | 46.1623  |
| 0.4497        | 6.7   | 2800 | 0.2597          | 0.4392 | 0.2698 | 0.4263 | 18.9423   | 42.2697  |
| 0.2582        | 7.18  | 3000 | 0.1583          | 0.4442 | 0.2579 | 0.431  | 21.5533   | 38.1671  |
| 0.2629        | 7.66  | 3200 | 0.1611          | 0.46   | 0.2767 | 0.4464 | 23.2936   | 39.0358  |
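
The log above can be scanned programmatically to compare checkpoints. The following is a small sketch that copies a subset of the columns from the table (step, validation loss, Rouge1, Sacrebleu, Gen Len) and picks the best evaluation step under different criteria. The names `EvalRow` and `best_by` are illustrative and not part of any library.

```python
from collections import namedtuple

EvalRow = namedtuple("EvalRow", "step val_loss rouge1 sacrebleu gen_len")

# Values copied from the training results table in this card.
ROWS = [
    EvalRow(200, 31.2765, 0.0004, 0.0003, 512.0),
    EvalRow(400, 6.7786, 0.0031, 0.0045, 89.2816),
    EvalRow(600, 0.6729, 0.0054, 0.0196, 39.0501),
    EvalRow(800, 0.6529, 0.0029, 0.0424, 41.401),
    EvalRow(1000, 0.5630, 0.0058, 0.0201, 39.0334),
    EvalRow(1200, 0.4697, 0.0688, 0.1156, 39.0453),
    EvalRow(1400, 10.4655, 0.2098, 0.671, 350.0334),
    EvalRow(1600, 1.9257, 0.3176, 3.6031, 138.7494),
    EvalRow(1800, 0.8459, 0.373, 16.8305, 59.8568),
    EvalRow(2000, 7.2350, 0.3147, 7.3572, 289.1432),
    EvalRow(2200, 0.8361, 0.3616, 16.2229, 61.0668),
    EvalRow(2400, 0.4219, 0.3635, 15.2744, 46.494),
    EvalRow(2600, 0.3516, 0.3796, 13.6464, 46.1623),
    EvalRow(2800, 0.2597, 0.4392, 18.9423, 42.2697),
    EvalRow(3000, 0.1583, 0.4442, 21.5533, 38.1671),
    EvalRow(3200, 0.1611, 0.46, 23.2936, 39.0358),
]


def best_by(rows, field, lowest=False):
    """Return the row with the lowest or highest value of the given field."""
    pick = min if lowest else max
    return pick(rows, key=lambda row: getattr(row, field))


best_loss = best_by(ROWS, "val_loss", lowest=True)
best_bleu = best_by(ROWS, "sacrebleu")
best_rouge1 = best_by(ROWS, "rouge1")

if __name__ == "__main__":
    print("lowest validation loss at step", best_loss.step)
    print("highest Sacrebleu at step", best_bleu.step)
    print("highest Rouge1 at step", best_rouge1.step)
```

The criteria do not agree on a single step: validation loss is lowest at step 3000, while Rouge1 and Sacrebleu are highest at step 3200, which is the checkpoint whose results are reported at the top of this card.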

Framework versions

  • Transformers 4.33.0
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.4
  • Tokenizers 0.13.3