---
license: apache-2.0
base_model: google/mt5-base
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: mt5-base-honda
  results: []
---
# mt5-base-honda
This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5629
- Rouge1: 43.6523
- Rouge2: 32.203
- Rougel: 43.3772
- Rougelsum: 43.3022
- Gen Len: 17.73
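
A minimal usage sketch with `transformers` is shown below. The repo ID `mt5-base-honda`, the input text, and the generation settings are assumptions; substitute the actual Hub path where this checkpoint is hosted.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repo ID; replace with the actual Hub path of this checkpoint.
model_id = "mt5-base-honda"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Your input text here", return_tensors="pt", truncation=True)
# Eval-set Gen Len averaged ~17.7 tokens, so a modest max_new_tokens suffices.
outputs = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```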
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 40
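
As a rough sketch, these settings map onto `Seq2SeqTrainingArguments` as follows. The output directory and the per-epoch evaluation schedule are assumptions, not taken from the original training script:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above; "mt5-base-honda" output dir is illustrative.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-base-honda",
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,   # effective train batch size: 4 * 8 = 32
    num_train_epochs=40,
    lr_scheduler_type="linear",
    seed=42,
    predict_with_generate=True,      # needed so ROUGE can be computed at eval time
    evaluation_strategy="epoch",     # assumption: the table below reports one eval per epoch
)
```

Adam's betas and epsilon are left at their defaults, (0.9, 0.999) and 1e-08, which match the values listed above.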
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| No log | 1.0 | 42 | 4.6175 | 8.5469 | 3.7013 | 8.3757 | 8.2968 | 24.2255 |
| 10.4753 | 1.99 | 84 | 2.0552 | 10.5551 | 4.9989 | 10.2056 | 10.1735 | 15.6706 |
| 3.9727 | 2.99 | 126 | 1.3455 | 17.4897 | 9.2722 | 17.0361 | 17.0862 | 22.9496 |
| 2.111 | 3.99 | 168 | 1.1659 | 25.8241 | 16.5472 | 24.746 | 24.758 | 20.2997 |
| 1.6646 | 4.99 | 210 | 1.0557 | 26.4848 | 16.6274 | 25.1504 | 25.1065 | 16.5816 |
| 1.4307 | 5.98 | 252 | 0.9476 | 27.9191 | 19.2121 | 26.8901 | 26.8025 | 17.4866 |
| 1.4307 | 6.98 | 294 | 0.8371 | 31.0402 | 21.0973 | 30.2516 | 30.1885 | 19.4837 |
| 1.2842 | 8.0 | 337 | 0.7391 | 30.1165 | 20.5251 | 29.578 | 29.4895 | 15.362 |
| 1.1333 | 9.0 | 379 | 0.7287 | 34.5585 | 25.243 | 34.0959 | 33.8699 | 14.1573 |
| 1.1795 | 9.99 | 421 | 0.8753 | 26.8989 | 18.8627 | 26.5091 | 26.3887 | 36.1039 |
| 1.3298 | 10.99 | 463 | 0.7194 | 32.1116 | 24.2333 | 31.8284 | 31.7729 | 28.5015 |
| 1.0536 | 11.99 | 505 | 0.6241 | 35.7743 | 27.7948 | 35.5473 | 35.4622 | 26.7211 |
| 1.0536 | 12.99 | 547 | 0.6308 | 37.2689 | 28.103 | 36.8811 | 36.8093 | 21.8665 |
| 0.888 | 13.98 | 589 | 0.6370 | 38.8088 | 29.6802 | 38.4384 | 38.2694 | 20.3501 |
| 0.8153 | 14.98 | 631 | 0.6071 | 37.9373 | 29.6887 | 37.652 | 37.4779 | 22.8902 |
| 0.7717 | 16.0 | 674 | 0.5852 | 40.3825 | 29.9582 | 40.2962 | 40.1624 | 18.7389 |
| 0.734 | 17.0 | 716 | 0.5800 | 40.6092 | 30.1735 | 40.4011 | 40.3258 | 18.3442 |
| 0.6963 | 17.99 | 758 | 0.5797 | 39.7132 | 29.0489 | 39.5127 | 39.3232 | 21.0682 |
| 0.6574 | 18.99 | 800 | 0.5892 | 39.4966 | 29.6245 | 39.2659 | 39.1309 | 20.1068 |
| 0.6574 | 19.99 | 842 | 0.5715 | 40.7632 | 30.7816 | 40.3728 | 40.2779 | 18.0267 |
| 0.616 | 20.99 | 884 | 0.5648 | 41.988 | 31.7066 | 41.6728 | 41.6091 | 18.4955 |
| 0.5983 | 21.98 | 926 | 0.5699 | 42.1128 | 31.661 | 41.9032 | 41.7323 | 16.9466 |
| 0.5726 | 22.98 | 968 | 0.5636 | 41.4489 | 30.5531 | 41.1694 | 41.1125 | 19.6053 |
| 0.5577 | 24.0 | 1011 | 0.5603 | 43.0244 | 31.7556 | 42.7213 | 42.6249 | 16.9733 |
| 0.5405 | 25.0 | 1053 | 0.5715 | 41.9882 | 31.1594 | 41.7023 | 41.5209 | 18.1068 |
| 0.5405 | 25.99 | 1095 | 0.5587 | 42.7531 | 31.9549 | 42.4015 | 42.3466 | 17.5786 |
| 0.5355 | 26.99 | 1137 | 0.5702 | 42.0918 | 31.0895 | 41.6868 | 41.6741 | 17.27 |
| 0.5041 | 27.99 | 1179 | 0.5520 | 43.1863 | 32.0579 | 42.8749 | 42.7887 | 17.2433 |
| 0.5005 | 28.99 | 1221 | 0.5683 | 42.2837 | 31.0531 | 42.0168 | 41.9643 | 17.5668 |
| 0.5013 | 29.98 | 1263 | 0.5626 | 42.8554 | 31.6127 | 42.6408 | 42.5058 | 17.9318 |
| 0.4599 | 30.98 | 1305 | 0.5637 | 42.8309 | 31.2431 | 42.6061 | 42.4722 | 18.6617 |
| 0.4599 | 32.0 | 1348 | 0.5620 | 43.4879 | 32.0117 | 43.1924 | 43.1162 | 17.3709 |
| 0.4783 | 33.0 | 1390 | 0.5616 | 42.9605 | 31.3625 | 42.6897 | 42.5864 | 18.2522 |
| 0.4742 | 33.99 | 1432 | 0.5548 | 43.2898 | 31.6867 | 43.0984 | 42.9612 | 17.5015 |
| 0.4598 | 34.99 | 1474 | 0.5596 | 43.9791 | 32.2278 | 43.6851 | 43.5645 | 17.5757 |
| 0.4381 | 35.99 | 1516 | 0.5638 | 43.7052 | 32.2458 | 43.392 | 43.31 | 17.6261 |
| 0.4496 | 36.99 | 1558 | 0.5567 | 43.9806 | 32.2596 | 43.654 | 43.6041 | 17.8902 |
| 0.4464 | 37.98 | 1600 | 0.5615 | 43.7515 | 32.4353 | 43.5184 | 43.4019 | 17.4629 |
| 0.4464 | 38.98 | 1642 | 0.5625 | 43.7698 | 32.3006 | 43.4733 | 43.3957 | 17.5935 |
| 0.4431 | 39.88 | 1680 | 0.5629 | 43.6523 | 32.203 | 43.3772 | 43.3022 | 17.73 |
### Framework versions
- Transformers 4.39.3
- Pytorch 2.2.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.1