
mT5-TextSimp-LT-BatchSize4-lr1e-4

This model is a fine-tuned version of google/mt5-base; the training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 0.0737
  • Rouge1: 0.7174
  • Rouge2: 0.5553
  • RougeL: 0.7108
  • SacreBLEU: 43.3127
  • Gen Len: 38.0501
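
The model name suggests Lithuanian text simplification with mT5. As a minimal sketch of how a checkpoint like this is typically loaded for inference, assuming the repo id and a raw-sentence input format (neither is confirmed by this card):

```python
# Minimal inference sketch -- the repo id and the plain-sentence input
# format are assumptions; the card does not document a usage example.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "mT5-TextSimp-LT-BatchSize4-lr1e-4"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# A Lithuanian sentence to simplify (assumed input format).
text = "Vilnius yra Lietuvos sostinė."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Gen Len averaged ~38 tokens on the eval set, so a generation budget
# somewhat above that is a reasonable guess.
outputs = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```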

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 8
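
As a hedged sketch, these values map onto transformers Seq2SeqTrainingArguments roughly as follows; output_dir and the evaluation cadence are assumptions (the results table below is logged every 200 steps), and the Adam betas/epsilon above are the library defaults:

```python
# Sketch of the reported hyperparameters as Seq2SeqTrainingArguments.
# output_dir and the eval cadence are assumptions, not from the card;
# adam_beta1/adam_beta2/adam_epsilon match the reported values by default.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mT5-TextSimp-LT-BatchSize4-lr1e-4",
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=8,
    evaluation_strategy="steps",  # results below appear every 200 steps
    eval_steps=200,
    predict_with_generate=True,   # needed for ROUGE/SacreBLEU during eval
)
```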

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | SacreBLEU | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 25.4634 | 0.48 | 200 | 18.5157 | 0.0061 | 0.0 | 0.0059 | 0.0008 | 512.0 |
| 1.1451 | 0.96 | 400 | 0.6596 | 0.0161 | 0.0003 | 0.0154 | 0.0215 | 39.0453 |
| 0.6441 | 1.44 | 600 | 0.4981 | 0.0272 | 0.0012 | 0.0259 | 0.0166 | 39.0453 |
| 0.247 | 1.91 | 800 | 0.1420 | 0.4769 | 0.2826 | 0.465 | 20.3212 | 38.0501 |
| 0.1549 | 2.39 | 1000 | 0.1032 | 0.6114 | 0.4299 | 0.5998 | 30.2603 | 38.0501 |
| 0.1482 | 2.87 | 1200 | 0.0934 | 0.6592 | 0.4815 | 0.6496 | 34.4213 | 38.0501 |
| 0.1163 | 3.35 | 1400 | 0.0867 | 0.6734 | 0.4968 | 0.6651 | 36.3741 | 38.0501 |
| 0.1042 | 3.83 | 1600 | 0.0816 | 0.6826 | 0.5127 | 0.6753 | 38.128 | 38.0501 |
| 0.1109 | 4.31 | 1800 | 0.0816 | 0.6893 | 0.5191 | 0.6818 | 39.3294 | 38.0501 |
| 0.1029 | 4.78 | 2000 | 0.0798 | 0.6968 | 0.5284 | 0.6901 | 40.5064 | 38.0501 |
| 0.0877 | 5.26 | 2200 | 0.0766 | 0.7006 | 0.5372 | 0.694 | 40.5295 | 38.0501 |
| 0.0748 | 5.74 | 2400 | 0.0759 | 0.7092 | 0.5403 | 0.7028 | 41.4424 | 38.0501 |
| 0.0941 | 6.22 | 2600 | 0.0754 | 0.7134 | 0.5471 | 0.7066 | 42.4212 | 38.0501 |
| 0.1095 | 6.7 | 2800 | 0.0737 | 0.7198 | 0.5547 | 0.7135 | 42.8225 | 38.0501 |
| 0.0749 | 7.18 | 3000 | 0.0735 | 0.7165 | 0.5536 | 0.7107 | 42.9748 | 38.0501 |
| 0.073 | 7.66 | 3200 | 0.0737 | 0.7174 | 0.5553 | 0.7108 | 43.3127 | 38.0501 |
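
A hedged sketch of how metric columns like these are commonly computed with the evaluate library; the exact metric configuration used for this card is not documented, and this mirrors the standard Transformers summarization recipe:

```python
# Sketch of a compute_metrics function producing the columns above
# (ROUGE, SacreBLEU, Gen Len). The setup actually used for this card
# is not documented; this follows the standard seq2seq eval recipe.
import numpy as np
import evaluate

rouge = evaluate.load("rouge")
sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds, tokenizer):
    preds, labels = eval_preds
    # Replace -100 (ignored label positions) before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    rouge_scores = rouge.compute(
        predictions=decoded_preds, references=decoded_labels
    )
    bleu = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Gen Len: mean number of non-padding tokens in the generations.
    gen_len = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    )
    return {
        "rouge1": rouge_scores["rouge1"],
        "rouge2": rouge_scores["rouge2"],
        "rougeL": rouge_scores["rougeL"],
        "sacrebleu": bleu["score"],
        "gen_len": gen_len,
    }
```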

Framework versions

  • Transformers 4.33.0
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.4
  • Tokenizers 0.13.3