deed_summarization_mt5_version_1

This model is a fine-tuned version of google/mt5-base. The training dataset is not specified. It achieves the following results on the evaluation set after the final epoch (epoch 25):

  • Loss: 0.5863
  • Rouge1: 1.0138
  • Rouge2: 0.6875
  • RougeL: 1.0233
  • RougeLsum: 1.0941
  • Gen Len: 288.1509
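
As a starting point for trying the checkpoint, the following is a minimal sketch of loading it with the Transformers auto classes. The repository id is hypothetical, since this card gives only the model name and not its namespace, and the `max_new_tokens` and input `max_length` values are assumptions rather than settings documented here.

```python
# Hypothetical repository id: the card gives only the model name, not its namespace.
MODEL_ID = "your-namespace/deed_summarization_mt5_version_1"


def summarize(text: str, max_new_tokens: int = 512) -> str:
    """Generate a summary for one document with the fine-tuned checkpoint."""
    # Imported here so that defining the helper does not require transformers/torch.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long inputs; the 1024-token input cap is an assumption.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(document_text)` downloads the weights on first use, so `MODEL_ID` has to be replaced with the real repository path before this will run.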

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 5000
  • num_epochs: 25

Training results

Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len
25.5422 | 1.0 | 375 | 16.6467 | 0.6477 | 0.0 | 0.6383 | 0.6639 | 25.0
11.9626 | 2.0 | 750 | 13.5214 | 0.6633 | 0.0 | 0.6553 | 0.6745 | 38.5912
10.1002 | 3.0 | 1125 | 7.4294 | 0.7257 | 0.0 | 0.7163 | 0.7264 | 386.6164
2.8844 | 4.0 | 1500 | 2.8574 | 0.7257 | 0.0 | 0.7163 | 0.7264 | 499.0
4.1183 | 5.0 | 1875 | 9.6893 | 0.7257 | 0.0 | 0.7163 | 0.7264 | 499.0
1.4443 | 6.0 | 2250 | 2.4224 | 0.7257 | 0.0 | 0.7163 | 0.7264 | 466.7673
3.8512 | 7.0 | 2625 | 1.5813 | 0.7257 | 0.0 | 0.7163 | 0.7264 | 432.4717
8.6527 | 8.0 | 3000 | 1.4532 | 0.7257 | 0.0 | 0.7163 | 0.7264 | 480.6164
0.5302 | 9.0 | 3375 | 1.1597 | 0.7257 | 0.0 | 0.7163 | 0.7264 | 419.239
1.2311 | 10.0 | 3750 | 0.9806 | 0.9895 | 0.1006 | 0.9135 | 0.9189 | 383.6855
0.8903 | 11.0 | 4125 | 0.8961 | 0.9609 | 0.1578 | 0.8871 | 0.8934 | 376.6038
0.8742 | 12.0 | 4500 | 0.8109 | 1.1104 | 0.2243 | 1.0038 | 1.007 | 388.3648
0.5934 | 13.0 | 4875 | 0.7588 | 0.3145 | 0.2287 | 0.3145 | 0.3145 | 341.717
0.1715 | 14.0 | 5250 | 0.7073 | 0.2795 | 0.2013 | 0.2795 | 0.2795 | 333.434
0.4363 | 15.0 | 5625 | 0.6780 | 0.4368 | 0.2287 | 0.4368 | 0.4368 | 326.7044
1.0736 | 16.0 | 6000 | 0.6647 | 0.7163 | 0.5169 | 0.7512 | 0.7512 | 299.4151
0.1069 | 17.0 | 6375 | 0.6294 | 0.856 | 0.6038 | 0.863 | 0.8595 | 312.434
0.1434 | 18.0 | 6750 | 0.6358 | 0.7512 | 0.5222 | 0.7862 | 0.808 | 291.4403
0.4344 | 19.0 | 7125 | 0.6164 | 1.1082 | 0.7576 | 1.1305 | 1.1574 | 304.7484
0.1038 | 20.0 | 7500 | 0.6066 | 0.8572 | 0.6108 | 0.8758 | 0.9085 | 297.3145
0.5519 | 21.0 | 7875 | 0.5972 | 0.4354 | 0.2935 | 0.5382 | 0.5382 | 281.5786
0.0804 | 22.0 | 8250 | 0.5994 | 0.6464 | 0.5583 | 0.7741 | 0.7794 | 305.805
0.3696 | 23.0 | 8625 | 0.5884 | 0.6362 | 0.3246 | 0.6362 | 0.6362 | 291.434
0.3966 | 24.0 | 9000 | 0.5852 | 0.7133 | 0.408 | 0.7311 | 0.8082 | 281.9119
0.3484 | 25.0 | 9375 | 0.5863 | 1.0138 | 0.6875 | 1.0233 | 1.0941 | 288.1509
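
A few quantities can be read off the training results with simple arithmetic. The sketch below copies the validation losses from the results, finds the epoch with the lowest value, and derives the steps per epoch. The estimate of examples per epoch assumes a single device and no gradient accumulation, neither of which is stated in this card.

```python
# Validation loss per epoch, copied from the training results (epochs 1 to 25).
val_loss = [
    16.6467, 13.5214, 7.4294, 2.8574, 9.6893, 2.4224, 1.5813, 1.4532, 1.1597,
    0.9806, 0.8961, 0.8109, 0.7588, 0.7073, 0.6780, 0.6647, 0.6294, 0.6358,
    0.6164, 0.6066, 0.5972, 0.5994, 0.5884, 0.5852, 0.5863,
]

# Epoch numbers are 1-based, list indices are 0-based.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

total_steps, num_epochs, train_batch_size = 9375, 25, 2
steps_per_epoch = total_steps // num_epochs

# Upper bound on examples per epoch; assumes one device and no gradient accumulation.
max_examples_per_epoch = steps_per_epoch * train_batch_size

# How many epochs the 5000 warmup steps span.
warmup_epochs = 5000 / steps_per_epoch

print(best_epoch, val_loss[best_epoch - 1])
print(steps_per_epoch, max_examples_per_epoch, round(warmup_epochs, 2))
```

The lowest validation loss (0.5852) occurs at epoch 24, marginally below the final-epoch value of 0.5863 reported at the top of the card.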

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.1.0.dev20230811+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.2

Model size

  • Format: Safetensors
  • Parameters: 582M
  • Tensor type: F32