
mt5-small-finetuned-DEPlain

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6826
  • Rouge1: 55.2811
  • Rouge2: 33.4022
  • Rougel: 49.0555
  • Rougelsum: 49.8535
  • Gen Len: 16.3063
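
The ROUGE figures above are on a 0 to 100 scale. As a rough illustration of what Rouge1 measures, the sketch below computes a unigram-overlap F1 score on whitespace tokens. It is a simplified stand-in, not the scorer used for this card: the card does not say which implementation produced the numbers, and real ROUGE packages apply their own tokenization and optional stemming. The function name `rouge1_f` is hypothetical.

```python
from collections import Counter


def rouge1_f(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 on lowercase whitespace tokens, scaled to 0-100."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 100 * 2 * precision * recall / (precision + recall)


# Three of the four reference tokens are reproduced: precision 1.0, recall 0.75.
score = rouge1_f("the cat sat", "the cat sat down")
```

Rouge2 applies the same idea to bigrams, while Rougel and Rougelsum are based on the longest common subsequence.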

Model description

More information needed

Intended uses & limitations

More information needed
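
A checkpoint of this kind can usually be loaded with the sequence-to-sequence classes in Transformers. The sketch below is a minimal example under stated assumptions, not usage documented by the author: `MODEL_ID` is a hypothetical placeholder because the full repository ID is not given here, the helper name `generate_text` is made up for illustration, and the card does not say whether inputs need a task prefix.

```python
def generate_text(text: str, model_id: str, max_new_tokens: int = 64) -> str:
    """Run one input through a seq2seq checkpoint and return the decoded output."""
    # Imported lazily so the function can be defined without the library installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Hypothetical placeholder: replace with the actual Hub repository ID or a local path.
MODEL_ID = "<user>/mt5-small-finetuned-DEPlain"
```

Calling `generate_text("...", MODEL_ID)` downloads the weights on first use, so it needs network access or a local copy of the checkpoint.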

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
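
With `lr_scheduler_type: linear`, the learning rate decays linearly from its initial value to zero over the run. The sketch below reproduces that shape in plain Python. It assumes no warmup, since the card lists no warmup setting, and takes the total step count from the last row of the training results (26660). The names `linear_lr`, `BASE_LR`, `TOTAL_STEPS`, and `WARMUP_STEPS` are illustrative only.

```python
BASE_LR = 2e-5        # learning_rate from the list above
TOTAL_STEPS = 26660   # final step in the training results (20 epochs)
WARMUP_STEPS = 0      # assumption: the card lists no warmup setting


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


lr_start = linear_lr(0)         # full learning rate at the first step
lr_middle = linear_lr(13330)    # half the learning rate after epoch 10
lr_end = linear_lr(TOTAL_STEPS) # zero at the end of training
```

Under these assumptions, the late epochs train with a very small learning rate, which is consistent with the validation loss changing only in the fourth decimal place over the last few epochs.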

Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|---------|-----------|---------|
| 6.1176        | 1.0   | 1333  | 2.2908          | 44.5349 | 24.6346 | 39.4206 | 40.1782   | 14.1202 |
| 2.9848        | 2.0   | 2666  | 2.0502          | 50.238  | 29.4215 | 44.3707 | 45.2709   | 15.4249 |
| 2.7551        | 3.0   | 3999  | 1.9383          | 52.4619 | 30.9025 | 46.0413 | 47.0121   | 15.814  |
| 2.5148        | 4.0   | 5332  | 1.8737          | 53.5574 | 31.7729 | 47.2411 | 48.1681   | 15.9764 |
| 2.3973        | 5.0   | 6665  | 1.8404          | 54.3931 | 32.4048 | 47.8747 | 48.7838   | 16.1422 |
| 2.3364        | 6.0   | 7998  | 1.7939          | 54.1979 | 32.253  | 47.8117 | 48.6785   | 16.0699 |
| 2.2807        | 7.0   | 9331  | 1.7804          | 54.3636 | 32.486  | 48.0067 | 48.8565   | 16.1324 |
| 2.2326        | 8.0   | 10664 | 1.7678          | 54.7743 | 32.7932 | 48.3657 | 49.1933   | 16.1608 |
| 2.1839        | 9.0   | 11997 | 1.7436          | 54.8052 | 32.8554 | 48.4859 | 49.3042   | 16.2047 |
| 2.1508        | 10.0  | 13330 | 1.7309          | 54.8064 | 32.8648 | 48.4888 | 49.3149   | 16.1779 |
| 2.1245        | 11.0  | 14663 | 1.7251          | 55.0598 | 33.1609 | 48.7331 | 49.6079   | 16.2705 |
| 2.1003        | 12.0  | 15996 | 1.7104          | 54.9449 | 33.1058 | 48.7477 | 49.5681   | 16.2185 |
| 2.0486        | 13.0  | 17329 | 1.6998          | 55.2225 | 33.3383 | 48.9821 | 49.8075   | 16.2494 |
| 2.0494        | 14.0  | 18662 | 1.6966          | 55.1758 | 33.3602 | 48.9134 | 49.7473   | 16.2648 |
| 2.0307        | 15.0  | 19995 | 1.6912          | 55.2276 | 33.3542 | 49.0322 | 49.8302   | 16.2721 |
| 2.0296        | 16.0  | 21328 | 1.6845          | 55.153  | 33.289  | 48.8609 | 49.7004   | 16.2754 |
| 2.01          | 17.0  | 22661 | 1.6842          | 55.3664 | 33.4755 | 49.1395 | 49.9518   | 16.3168 |
| 1.989         | 18.0  | 23994 | 1.6836          | 55.2333 | 33.3763 | 49.0259 | 49.8265   | 16.2794 |
| 2.0067        | 19.0  | 25327 | 1.6829          | 55.3122 | 33.4023 | 49.1034 | 49.8989   | 16.3022 |
| 2.0174        | 20.0  | 26660 | 1.6826          | 55.2811 | 33.4022 | 49.0555 | 49.8535   | 16.3063 |
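
The Step column also gives a rough bound on the size of the training set, which the card does not state. The arithmetic below assumes a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped. None of these is confirmed by the card, so the result is an estimate only.

```python
STEPS_PER_EPOCH = 1333   # step count after epoch 1 in the results above
TRAIN_BATCH_SIZE = 8     # train_batch_size from the hyperparameters
NUM_EPOCHS = 20          # num_epochs from the hyperparameters

# Assumption: one device, no gradient accumulation, last partial batch kept.
# The final batch of an epoch then holds between 1 and 8 examples.
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE             # 10664
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1   # 10657

# Consistency check against the last row of the results.
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS                    # 26660
```

Under these assumptions the training split holds between 10,657 and 10,664 examples, and 20 epochs account for the final step count of 26660.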

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.1.2+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.0

Model size

  • Parameters: 300M
  • Tensor type: F32
  • Format: Safetensors