
mt5-small-finetuned-19jan-7

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.6123
  • ROUGE-1: 6.8298
  • ROUGE-2: 0.1667
  • ROUGE-L: 6.5947
  • ROUGE-Lsum: 6.6685
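
The ROUGE figures are n-gram / longest-common-subsequence overlap F1 scores, apparently on a 0–100 scale as in the standard summarization fine-tuning scripts. As a minimal, dependency-free illustration of what the ROUGE-1 number measures (whitespace tokenization only; the real metric is computed by the rouge_score package, which also applies stemming):

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1, the core of the ROUGE-1 score above."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped counting: each shared unigram counts at most as often
    # as it appears in either text.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# Scaled by 100 to match the reporting convention used in this card.
print(round(100 * rouge1_f1("the cat sat", "the cat sat on the mat"), 2))  # → 66.67
```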

Model description

More information needed

Intended uses & limitations

More information needed
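
The card does not state the task, but the ROUGE metrics above are consistent with a summarization-style seq2seq use. A hedged usage sketch, assuming the checkpoint is loaded from the Hub (the repo id below must be prefixed with the uploader's namespace, and the generation settings are illustrative, not the author's):

```python
def generate(text, tokenizer, model, max_new_tokens=64):
    """Tokenize, generate with beam search, and decode one output."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(out[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Heavy optional dependency kept out of the module top level.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    model_id = "mt5-small-finetuned-19jan-7"  # prefix with the uploader's namespace
    tok = AutoTokenizer.from_pretrained(model_id)
    mdl = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    print(generate("Text to condense goes here.", tok, mdl))
```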

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 12
  • eval_batch_size: 12
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60
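
The hyperparameters above map directly onto `Seq2SeqTrainingArguments` (the listed Adam betas and epsilon are its defaults). A sketch, in which the output directory, the per-epoch evaluation cadence, and `predict_with_generate` are assumptions inferred from the card title and the per-epoch metrics table below:

```python
from transformers import Seq2SeqTrainingArguments

# Config sketch only -- mirrors the hyperparameter list above.
args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-19jan-7",  # assumed from the card title
    learning_rate=3e-5,
    per_device_train_batch_size=12,
    per_device_eval_batch_size=12,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=60,
    evaluation_strategy="epoch",   # assumed: the table logs metrics once per epoch
    predict_with_generate=True,    # assumed: needed to compute ROUGE during eval
)
```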

Training results

Training Loss  Epoch  Step  Validation Loss  ROUGE-1  ROUGE-2  ROUGE-L  ROUGE-Lsum
16.2953 1.0 50 5.4420 2.3065 0.0 2.3217 2.3089
10.6895 2.0 100 4.4691 3.2975 0.3693 3.2976 3.3376
7.0377 3.0 150 3.2638 4.1896 0.3485 4.1487 4.1878
5.7221 4.0 200 3.0772 6.2012 0.7955 6.1846 6.3083
4.9356 5.0 250 3.0312 5.2032 0.8545 5.1829 5.2263
4.4656 6.0 300 3.0022 5.6901 1.3505 5.6184 5.6791
4.2279 7.0 350 2.9585 5.6907 1.5424 5.644 5.7768
4.0578 8.0 400 2.9098 5.7425 1.0202 5.6452 5.7881
3.9236 9.0 450 2.8686 6.2001 1.1793 6.1891 6.2508
3.8237 10.0 500 2.8222 5.9182 1.1793 5.8436 5.9807
3.7078 11.0 550 2.7890 5.4733 1.3896 5.3702 5.4957
3.641 12.0 600 2.7522 5.8312 1.1793 5.784 5.9037
3.5527 13.0 650 2.7168 6.3129 1.1793 6.2924 6.384
3.5281 14.0 700 2.7000 9.1787 0.8333 9.1491 9.2241
3.4547 15.0 750 2.6966 7.8778 0.3333 7.8306 7.9167
3.4386 16.0 800 2.6892 8.3907 0.3333 8.3167 8.4
3.3749 17.0 850 2.6786 8.6167 0.4167 8.5917 8.5787
3.3681 18.0 900 2.6895 8.2466 0.4167 8.1799 8.2407
3.3173 19.0 950 2.6957 8.1742 0.4167 8.1197 8.1429
3.3034 20.0 1000 2.6721 8.2466 0.4167 8.1799 8.2407
3.2594 21.0 1050 2.6698 8.569 0.4167 8.5419 8.619
3.2138 22.0 1100 2.6676 8.2722 0.4167 8.2343 8.3037
3.2239 23.0 1150 2.6537 8.1444 0.4167 8.1051 8.1301
3.1887 24.0 1200 2.6529 8.1444 0.4167 8.1051 8.1301
3.1641 25.0 1250 2.6685 7.7777 0.1667 7.7204 7.8143
3.162 26.0 1300 2.6619 8.3776 0.3333 8.4135 8.4692
3.1114 27.0 1350 2.6632 8.3776 0.3333 8.4135 8.4692
3.0645 28.0 1400 2.6438 7.8811 0.3333 7.8333 7.9484
3.0984 29.0 1450 2.6384 7.3936 0.1667 7.3609 7.4051
3.0712 30.0 1500 2.6389 6.9609 0.1667 6.875 7.0253
3.0662 31.0 1550 2.6346 7.95 0.1667 7.9051 8.0218
3.0294 32.0 1600 2.6420 7.3936 0.1667 7.3609 7.4051
3.0143 33.0 1650 2.6325 7.6526 0.1667 7.6869 7.7551
3.002 34.0 1700 2.6384 7.9436 0.1667 7.9317 8.016
2.9964 35.0 1750 2.6262 8.2958 0.4167 8.2317 8.3936
2.9893 36.0 1800 2.6351 8.6535 0.1667 8.616 8.7333
2.9862 37.0 1850 2.6320 8.2452 0.1667 8.2 8.3218
2.9588 38.0 1900 2.6214 7.6656 0.1667 7.6819 7.7
2.9697 39.0 1950 2.6229 7.1452 0.1667 7.1051 7.1942
2.9433 40.0 2000 2.6209 7.5775 0.4167 7.4893 7.5833
2.9306 41.0 2050 2.6197 7.525 0.4167 7.4435 7.5351
2.9382 42.0 2100 2.6190 7.525 0.4167 7.4435 7.5351
2.9269 43.0 2150 2.6234 7.3614 0.4167 7.2092 7.3592
2.9152 44.0 2200 2.6237 6.9976 0.1667 6.8777 7.0333
2.9137 45.0 2250 2.6213 6.9976 0.1667 6.8777 7.0333
2.9011 46.0 2300 2.6212 6.9976 0.1667 6.8777 7.0333
2.8941 47.0 2350 2.6188 6.7768 0.1667 6.6509 6.812
2.9143 48.0 2400 2.6126 7.0875 0.1667 6.803 6.9337
2.8798 49.0 2450 2.6207 6.4458 0.1667 6.3221 6.4527
2.8701 50.0 2500 2.6172 6.7542 0.1667 6.4857 6.5729
2.8823 51.0 2550 2.6161 6.9971 0.1667 6.6819 6.7968
2.8724 52.0 2600 2.6171 6.8298 0.1667 6.5947 6.6685
2.8635 53.0 2650 2.6176 6.8298 0.1667 6.5947 6.6685
2.8803 54.0 2700 2.6134 6.1417 0.1667 5.929 6.0423
2.8608 55.0 2750 2.6118 6.4953 0.1667 6.2113 6.3554
2.8655 56.0 2800 2.6125 6.4976 0.1667 6.2625 6.3539
2.856 57.0 2850 2.6136 6.8298 0.1667 6.5947 6.6685
2.8837 58.0 2900 2.6124 6.8298 0.1667 6.5947 6.6685
2.8871 59.0 2950 2.6123 6.8298 0.1667 6.5947 6.6685
2.8537 60.0 3000 2.6123 6.8298 0.1667 6.5947 6.6685

Framework versions

  • Transformers 4.25.1
  • Pytorch 1.13.1+cu116
  • Datasets 2.8.0
  • Tokenizers 0.13.2
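
To reproduce this environment, the listed versions can be pinned; the `+cu116` PyTorch build reflects the CUDA 11.6 training machine and is an assumption about your hardware:

```shell
# Versions from the "Framework versions" list above.
pip install "transformers==4.25.1" "datasets==2.8.0" "tokenizers==0.13.2"
# The +cu116 build is served from PyTorch's CUDA 11.6 wheel index;
# plain "torch==1.13.1" works for CPU-only inference.
pip install "torch==1.13.1+cu116" --extra-index-url https://download.pytorch.org/whl/cu116
```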