msarmento/mt5-small-finetuned-xlsum-pt

	---
	license: apache-2.0
	base_model: google/mt5-small
	tags:
	- summarization
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: mt5-small-finetuned-xlsum-pt
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# mt5-small-finetuned-xlsum-pt

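	This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small). The Trainer did not record the training dataset; the model name suggests the Portuguese subset of XL-Sum, but this card does not confirm that.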
	It achieves the following results on the evaluation set after the final epoch (epoch 10):
	- Loss: 0.0986
	- Rouge1: 16.5756
	- Rouge2: 13.7639
	- Rougel: 15.7445
	- Rougelsum: 16.5112

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
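
Until this section is filled in, the following is a minimal sketch of how the checkpoint might be loaded for summarization. It assumes the repository id `msarmento/mt5-small-finetuned-xlsum-pt` and the standard `transformers` seq2seq classes. The helper names (`load_checkpoint`, `summarize`) and the length limits (512 input tokens, 64 new tokens) are illustrative assumptions, not settings recorded by the Trainer.

```python
MODEL_ID = "msarmento/mt5-small-finetuned-xlsum-pt"


def load_checkpoint(model_id: str = MODEL_ID):
    """Load the tokenizer and model (downloads the weights on first use)."""
    # Imported lazily so the helpers can be defined without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def summarize(text, tokenizer, model, max_input_tokens=512, max_new_tokens=64):
    """Generate a summary for one document.

    The length limits are assumptions; the card does not state the values
    used during training or evaluation.
    """
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example (requires network access to download the checkpoint):
# tokenizer, model = load_checkpoint()
# print(summarize("Texto da notícia em português ...", tokenizer, model))
```

Given the ROUGE scores reported above, generated summaries should be checked by hand before being relied on.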

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5.6e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 10
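
To make the `linear` scheduler setting concrete, here is a small sketch of the learning rate it implies at a given optimizer step. It assumes zero warmup steps (the card lists none) and takes 125 steps per epoch from the Step column of the training results. The function `linear_lr` is a hypothetical helper that mirrors the usual linear-decay rule, not code taken from the training run.

```python
BASE_LR = 5.6e-05        # learning_rate from the card
NUM_EPOCHS = 10          # num_epochs from the card
STEPS_PER_EPOCH = 125    # from the Step column of the training results
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH
WARMUP_STEPS = 0         # assumption: the card does not list any warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, midpoint, and end of training.
for epoch in (0, 5, 10):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>2} (step {step:>4}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate falls from its initial value to half of it by the end of epoch 5 and reaches zero at step 1250.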

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|
	| 0.7681        | 1.0   | 125  | 0.1393          | 12.9432 | 9.5039  | 12.2871 | 12.7291   |
	| 0.5282        | 2.0   | 250  | 0.1231          | 13.4575 | 10.0697 | 12.6449 | 13.2      |
	| 0.4132        | 3.0   | 375  | 0.1134          | 16.6964 | 14.0187 | 15.7338 | 16.6025   |
	| 0.3534        | 4.0   | 500  | 0.1077          | 16.8961 | 14.2203 | 15.9187 | 16.7712   |
	| 0.3126        | 5.0   | 625  | 0.1039          | 16.993  | 14.0876 | 15.8914 | 16.9277   |
	| 0.283         | 6.0   | 750  | 0.1023          | 16.7431 | 13.9453 | 15.8758 | 16.6413   |
	| 0.2675        | 7.0   | 875  | 0.1008          | 16.6566 | 13.8639 | 15.775  | 16.5481   |
	| 0.2509        | 8.0   | 1000 | 0.0987          | 16.6829 | 13.935  | 15.872  | 16.6222   |
	| 0.2441        | 9.0   | 1125 | 0.0987          | 16.6085 | 13.7884 | 15.7896 | 16.5412   |
	| 0.2401        | 10.0  | 1250 | 0.0986          | 16.5756 | 13.7639 | 15.7445 | 16.5112   |
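
The table can also be read programmatically. The sketch below copies its rows into a list, finds the epoch with the highest ROUGE-1 and the epoch with the lowest validation loss, and bounds the training set size from the step count. The names `EpochResult` and `best_by` are hypothetical helpers. The size estimate assumes a single device, no gradient accumulation, and a final batch that may be partial, none of which the card states.

```python
from typing import NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    step: int
    train_loss: float
    val_loss: float
    rouge1: float
    rouge2: float
    rougel: float
    rougelsum: float


# Rows copied from the training results table.
RESULTS = [
    EpochResult(1, 125, 0.7681, 0.1393, 12.9432, 9.5039, 12.2871, 12.7291),
    EpochResult(2, 250, 0.5282, 0.1231, 13.4575, 10.0697, 12.6449, 13.2),
    EpochResult(3, 375, 0.4132, 0.1134, 16.6964, 14.0187, 15.7338, 16.6025),
    EpochResult(4, 500, 0.3534, 0.1077, 16.8961, 14.2203, 15.9187, 16.7712),
    EpochResult(5, 625, 0.3126, 0.1039, 16.993, 14.0876, 15.8914, 16.9277),
    EpochResult(6, 750, 0.283, 0.1023, 16.7431, 13.9453, 15.8758, 16.6413),
    EpochResult(7, 875, 0.2675, 0.1008, 16.6566, 13.8639, 15.775, 16.5481),
    EpochResult(8, 1000, 0.2509, 0.0987, 16.6829, 13.935, 15.872, 16.6222),
    EpochResult(9, 1125, 0.2441, 0.0987, 16.6085, 13.7884, 15.7896, 16.5412),
    EpochResult(10, 1250, 0.2401, 0.0986, 16.5756, 13.7639, 15.7445, 16.5112),
]

TRAIN_BATCH_SIZE = 8  # train_batch_size from the card


def best_by(results, key, highest=True):
    """Return the row that maximizes (or minimizes) the given key."""
    chooser = max if highest else min
    return chooser(results, key=key)


best_rouge1 = best_by(RESULTS, key=lambda r: r.rouge1)
lowest_val_loss = best_by(RESULTS, key=lambda r: r.val_loss, highest=False)

# Bounds on the number of training examples, under the stated assumptions.
steps_per_epoch = RESULTS[0].step
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1

print(f"Highest ROUGE-1: epoch {best_rouge1.epoch} ({best_rouge1.rouge1})")
print(f"Lowest validation loss: epoch {lowest_val_loss.epoch} ({lowest_val_loss.val_loss})")
print(f"Estimated training examples: {min_examples} to {max_examples}")
```

On these numbers ROUGE-1 peaks at epoch 5 while validation loss keeps falling through epoch 10, so the final checkpoint is not the one with the best ROUGE-1.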


	### Framework versions

	- Transformers 4.37.2
	- Pytorch 2.1.0+cu121
	- Datasets 2.17.1
	- Tokenizers 0.15.2