
mt5-base-honda

This model is a fine-tuned version of google/mt5-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5629
  • Rouge1: 43.6523
  • Rouge2: 32.203
  • Rougel: 43.3772
  • Rougelsum: 43.3022
  • Gen Len: 17.73
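The ROUGE numbers above are F-measures of n-gram overlap between generated and reference text (Rouge1: unigrams, Rouge2: bigrams, RougeL/RougeLsum: longest common subsequence). As a rough illustration of what Rouge1 measures, here is a minimal sketch of unigram-overlap F1 with naive whitespace tokenization; the actual `rouge_score` package used by the Trainer additionally applies tokenization and (optionally) stemming, so its values will differ:

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate (whitespace tokens)."""
    ref = Counter(reference.split())
    cand = Counter(candidate.split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat", "the cat ran"))  # 2 of 3 unigrams match -> 0.666...
```

Gen Len is simply the average token length of the generated sequences on the evaluation set.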

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
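Two of these values are derived rather than set directly: `total_train_batch_size` is the per-device batch size times the gradient-accumulation steps, and the linear scheduler decays the learning rate from 5e-05 to 0 over all optimizer steps. A small sketch, assuming no warmup steps (none are listed) and taking 1680 total optimizer steps from the final row of the results table below:

```python
def effective_batch_size(train_batch_size: int, grad_accum_steps: int) -> int:
    # Gradients from 8 micro-batches of 4 are accumulated before each
    # optimizer step, giving the reported total_train_batch_size of 32.
    return train_batch_size * grad_accum_steps

def linear_lr(step: int, total_steps: int = 1680, base_lr: float = 5e-5) -> float:
    # Linear decay from base_lr at step 0 down to 0 at the final step.
    return base_lr * max(0.0, (total_steps - step) / total_steps)

print(effective_batch_size(4, 8))   # 32
print(linear_lr(0), linear_lr(840), linear_lr(1680))  # 5e-05, 2.5e-05, 0.0
```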

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log | 1.0 | 42 | 4.6175 | 8.5469 | 3.7013 | 8.3757 | 8.2968 | 24.2255 |
| 10.4753 | 1.99 | 84 | 2.0552 | 10.5551 | 4.9989 | 10.2056 | 10.1735 | 15.6706 |
| 3.9727 | 2.99 | 126 | 1.3455 | 17.4897 | 9.2722 | 17.0361 | 17.0862 | 22.9496 |
| 2.111 | 3.99 | 168 | 1.1659 | 25.8241 | 16.5472 | 24.746 | 24.758 | 20.2997 |
| 1.6646 | 4.99 | 210 | 1.0557 | 26.4848 | 16.6274 | 25.1504 | 25.1065 | 16.5816 |
| 1.4307 | 5.98 | 252 | 0.9476 | 27.9191 | 19.2121 | 26.8901 | 26.8025 | 17.4866 |
| 1.4307 | 6.98 | 294 | 0.8371 | 31.0402 | 21.0973 | 30.2516 | 30.1885 | 19.4837 |
| 1.2842 | 8.0 | 337 | 0.7391 | 30.1165 | 20.5251 | 29.578 | 29.4895 | 15.362 |
| 1.1333 | 9.0 | 379 | 0.7287 | 34.5585 | 25.243 | 34.0959 | 33.8699 | 14.1573 |
| 1.1795 | 9.99 | 421 | 0.8753 | 26.8989 | 18.8627 | 26.5091 | 26.3887 | 36.1039 |
| 1.3298 | 10.99 | 463 | 0.7194 | 32.1116 | 24.2333 | 31.8284 | 31.7729 | 28.5015 |
| 1.0536 | 11.99 | 505 | 0.6241 | 35.7743 | 27.7948 | 35.5473 | 35.4622 | 26.7211 |
| 1.0536 | 12.99 | 547 | 0.6308 | 37.2689 | 28.103 | 36.8811 | 36.8093 | 21.8665 |
| 0.888 | 13.98 | 589 | 0.6370 | 38.8088 | 29.6802 | 38.4384 | 38.2694 | 20.3501 |
| 0.8153 | 14.98 | 631 | 0.6071 | 37.9373 | 29.6887 | 37.652 | 37.4779 | 22.8902 |
| 0.7717 | 16.0 | 674 | 0.5852 | 40.3825 | 29.9582 | 40.2962 | 40.1624 | 18.7389 |
| 0.734 | 17.0 | 716 | 0.5800 | 40.6092 | 30.1735 | 40.4011 | 40.3258 | 18.3442 |
| 0.6963 | 17.99 | 758 | 0.5797 | 39.7132 | 29.0489 | 39.5127 | 39.3232 | 21.0682 |
| 0.6574 | 18.99 | 800 | 0.5892 | 39.4966 | 29.6245 | 39.2659 | 39.1309 | 20.1068 |
| 0.6574 | 19.99 | 842 | 0.5715 | 40.7632 | 30.7816 | 40.3728 | 40.2779 | 18.0267 |
| 0.616 | 20.99 | 884 | 0.5648 | 41.988 | 31.7066 | 41.6728 | 41.6091 | 18.4955 |
| 0.5983 | 21.98 | 926 | 0.5699 | 42.1128 | 31.661 | 41.9032 | 41.7323 | 16.9466 |
| 0.5726 | 22.98 | 968 | 0.5636 | 41.4489 | 30.5531 | 41.1694 | 41.1125 | 19.6053 |
| 0.5577 | 24.0 | 1011 | 0.5603 | 43.0244 | 31.7556 | 42.7213 | 42.6249 | 16.9733 |
| 0.5405 | 25.0 | 1053 | 0.5715 | 41.9882 | 31.1594 | 41.7023 | 41.5209 | 18.1068 |
| 0.5405 | 25.99 | 1095 | 0.5587 | 42.7531 | 31.9549 | 42.4015 | 42.3466 | 17.5786 |
| 0.5355 | 26.99 | 1137 | 0.5702 | 42.0918 | 31.0895 | 41.6868 | 41.6741 | 17.27 |
| 0.5041 | 27.99 | 1179 | 0.5520 | 43.1863 | 32.0579 | 42.8749 | 42.7887 | 17.2433 |
| 0.5005 | 28.99 | 1221 | 0.5683 | 42.2837 | 31.0531 | 42.0168 | 41.9643 | 17.5668 |
| 0.5013 | 29.98 | 1263 | 0.5626 | 42.8554 | 31.6127 | 42.6408 | 42.5058 | 17.9318 |
| 0.4599 | 30.98 | 1305 | 0.5637 | 42.8309 | 31.2431 | 42.6061 | 42.4722 | 18.6617 |
| 0.4599 | 32.0 | 1348 | 0.5620 | 43.4879 | 32.0117 | 43.1924 | 43.1162 | 17.3709 |
| 0.4783 | 33.0 | 1390 | 0.5616 | 42.9605 | 31.3625 | 42.6897 | 42.5864 | 18.2522 |
| 0.4742 | 33.99 | 1432 | 0.5548 | 43.2898 | 31.6867 | 43.0984 | 42.9612 | 17.5015 |
| 0.4598 | 34.99 | 1474 | 0.5596 | 43.9791 | 32.2278 | 43.6851 | 43.5645 | 17.5757 |
| 0.4381 | 35.99 | 1516 | 0.5638 | 43.7052 | 32.2458 | 43.392 | 43.31 | 17.6261 |
| 0.4496 | 36.99 | 1558 | 0.5567 | 43.9806 | 32.2596 | 43.654 | 43.6041 | 17.8902 |
| 0.4464 | 37.98 | 1600 | 0.5615 | 43.7515 | 32.4353 | 43.5184 | 43.4019 | 17.4629 |
| 0.4464 | 38.98 | 1642 | 0.5625 | 43.7698 | 32.3006 | 43.4733 | 43.3957 | 17.5935 |
| 0.4431 | 39.88 | 1680 | 0.5629 | 43.6523 | 32.203 | 43.3772 | 43.3022 | 17.73 |
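Note that the validation loss bottoms out at 0.5520 around epoch 28 (step 1179), slightly below the final 0.5629, while the ROUGE scores continue to drift upward; the card does not say whether the best checkpoint was restored at the end of training (e.g. via `load_best_model_at_end`). A minimal sketch of picking the best checkpoint by validation loss, using an abridged set of (step, loss) pairs copied from the table above:

```python
# (optimizer step, validation loss) pairs taken from the table above (abridged).
history = [
    (42, 4.6175), (421, 0.8753), (800, 0.5892), (1179, 0.5520),
    (1432, 0.5548), (1516, 0.5638), (1680, 0.5629),
]

# Select the checkpoint with the lowest validation loss.
best_step, best_loss = min(history, key=lambda pair: pair[1])
print(best_step, best_loss)  # step 1179, loss 0.5520
```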

Framework versions

  • Transformers 4.39.3
  • PyTorch 2.2.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.1

Model size: 582M parameters · Tensor type: F32 (Safetensors)