metadata

license: apache-2.0
base_model: google/mt5-small
tags:
  - summarization
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: mt5-small-finetuned-news-summary-model-2
    results: []

mt5-small-finetuned-news-summary-model-2

This model is a fine-tuned version of google/mt5-small for news summarization. The training dataset is not specified. After the final epoch (step 4212), it achieves the following results on the evaluation set:

Loss: 2.5813
Rouge1: 29.4322
Rouge2: 11.4361
RougeL: 26.3875
RougeLsum: 26.297
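
For readers unfamiliar with the metric, here is a minimal sketch of how a ROUGE-1 F-measure can be computed from unigram overlap. It is a simplification: it assumes whitespace tokenization and no stemming, so it will not reproduce the scores above, which presumably come from a full ROUGE implementation. The function name `rouge1_f` is hypothetical, and the 0-100 scale is inferred from the reported values.

```python
from collections import Counter


def rouge1_f(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F-measure on a 0-100 scale (illustrative only)."""
    # Assumption: lowercase whitespace tokens; real implementations
    # normalize and tokenize more carefully.
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped unigram overlap: each token counts at most as often
    # as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f("the cat sat on the mat", "the cat lay on the mat"))
```

ROUGE-2 applies the same idea to bigrams, while RougeL and RougeLsum are based on the longest common subsequence rather than n-gram counts.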

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 4e-05
train_batch_size: 10
eval_batch_size: 10
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 12
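
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step under linear decay. It assumes no warmup (none is listed above) and takes the total step count of 4212 from the last row of the training results table. The function name `linear_lr` is hypothetical, and the formula follows the usual linear-decay rule rather than code taken from this training run.

```python
def linear_lr(step: int,
              base_lr: float = 4e-5,
              total_steps: int = 4212,
              warmup_steps: int = 0) -> float:
    """Learning rate after `step` optimizer updates under linear decay."""
    # Optional linear warmup from 0 to base_lr (assumed unused here).
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Start of training, end of epoch 1, midpoint, and final step.
    for step in (0, 351, 2106, 4212):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 4e-05, reaches half that value at step 2106, and falls to zero at step 4212.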

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	RougeL	RougeLsum
9.2632	0.9972	351	3.7059	17.3365	5.2307	15.438	15.3776
4.6719	1.9943	702	3.0896	19.5787	6.8278	18.0637	18.0255
4.1356	2.9915	1053	2.8713	22.5668	8.2899	20.551	20.5232
3.7852	3.9886	1404	2.7729	25.7974	9.9158	23.2398	23.2198
3.6194	4.9858	1755	2.7038	26.2572	10.0034	24.0326	23.9956
3.4864	5.9830	2106	2.6714	26.8149	9.9056	24.2704	24.1399
3.3965	6.9801	2457	2.6361	27.5399	10.3609	24.8286	24.7628
3.3422	7.9773	2808	2.6194	28.0298	10.6938	25.1678	25.0924
3.2879	8.9744	3159	2.5976	28.2324	10.6412	25.2803	25.1804
3.2391	9.9716	3510	2.5894	29.0155	11.174	25.9995	25.8843
3.2128	10.9688	3861	2.5854	29.3283	11.477	26.2235	26.1278
3.2214	11.9659	4212	2.5813	29.4322	11.4361	26.3875	26.297

Framework versions

Transformers 4.41.2
PyTorch 2.3.0+cu121
Datasets 2.19.2
Tokenizers 0.19.1