mqy/mt5-small-text-sum-2

text2text-generation

	---
	license: apache-2.0
	tags:
	- summarization
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: mt5-small-text-sum-2
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# mt5-small-text-sum-2

	This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unknown dataset.
	It achieves the following results on the evaluation set; the figures match the step-10000 row (epoch 28.99) of the training results table:
	- Loss: 2.3612
	- Rouge1: 21.38
	- Rouge2: 6.57
	- Rougel: 21.08
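
To make the metrics above concrete: ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The sketch below is a simplified illustration, assuming lowercase whitespace tokenization and an F1 score scaled to 0-100. It is not the scorer that produced the numbers above, which this card does not document, and the function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1 on lowercase whitespace tokens, scaled to 0-100."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(round(score, 2))
```

ROUGE-2 applies the same idea to bigrams, and ROUGE-L uses the longest common subsequence instead of n-gram counts.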

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
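
A minimal usage sketch, assuming the checkpoint is published under the repository id `mqy/mt5-small-text-sum-2` (taken from the page header) and that the standard `transformers` summarization pipeline applies, as the `summarization` tag suggests. The helper name `summarize` and the length limits are illustrative choices, not settings documented by this card.

```python
MODEL_ID = "mqy/mt5-small-text-sum-2"  # assumed repository id, from the page header


def summarize(text: str, max_length: int = 64, min_length: int = 8) -> str:
    """Summarize `text` with the fine-tuned checkpoint (downloads weights on first call)."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    # max_length and min_length are illustrative generation limits (in tokens).
    result = summarizer(text, max_length=max_length, min_length=min_length)
    return result[0]["summary_text"]
```

Calling `summarize(article_text)` returns the summary as a string. The training data and language coverage are undocumented, so output quality on any given domain is unknown.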

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 9
	- eval_batch_size: 9
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 40
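
The learning rate and `linear` scheduler listed above imply a step size that decays over the run. The sketch below works through the arithmetic under stated assumptions: zero warmup steps (the card lists none), a single device with no gradient accumulation, and a steps-per-epoch figure inferred from the training results table, which logs step 500 at epoch 1.45. None of these assumptions is confirmed by the card.

```python
LEARNING_RATE = 1e-4   # from the card
TRAIN_BATCH_SIZE = 9   # from the card
NUM_EPOCHS = 40        # from the card

# Inferred, not stated in the card: step 500 is logged at epoch 1.45.
STEPS_PER_EPOCH = round(500 / 1.45)          # 345
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS   # 13800
# Upper bound on training-set size, since the last batch of an epoch may be partial.
APPROX_TRAIN_EXAMPLES = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE  # 3105


def linear_lr(step: int, total_steps: int = TOTAL_STEPS,
              base_lr: float = LEARNING_RATE) -> float:
    """Linear decay from base_lr at step 0 to 0 at total_steps, assuming no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


for step in (0, 5000, 10000, TOTAL_STEPS):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions a full 40-epoch run would take about 13,800 optimizer steps, and the training set would hold roughly 3,100 examples.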

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel |
	|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|
	| 4.7204        | 1.45  | 500   | 2.6053          | 16.9   | 4.9    | 16.73  |
	| 3.1289        | 2.9   | 1000  | 2.4878          | 17.96  | 5.26   | 17.82  |
	| 2.8862        | 4.35  | 1500  | 2.4109          | 17.4   | 5.08   | 17.14  |
	| 2.7669        | 5.8   | 2000  | 2.4006          | 18.53  | 5.29   | 18.21  |
	| 2.6433        | 7.25  | 2500  | 2.4017          | 18.69  | 5.71   | 18.53  |
	| 2.5514        | 8.7   | 3000  | 2.3917          | 19.32  | 5.89   | 19.12  |
	| 2.4947        | 10.14 | 3500  | 2.3994          | 20.56  | 6.08   | 20.19  |
	| 2.3995        | 11.59 | 4000  | 2.3608          | 20.11  | 6.52   | 19.75  |
	| 2.3798        | 13.04 | 4500  | 2.3251          | 19.98  | 6.26   | 19.76  |
	| 2.3029        | 14.49 | 5000  | 2.3387          | 19.71  | 6.11   | 19.42  |
	| 2.2563        | 15.94 | 5500  | 2.3372          | 20.18  | 6.34   | 19.8   |
	| 2.2109        | 17.39 | 6000  | 2.3410          | 20.58  | 6.35   | 20.14  |
	| 2.166         | 18.84 | 6500  | 2.3432          | 20.93  | 6.5    | 20.63  |
	| 2.1283        | 20.29 | 7000  | 2.3404          | 21.0   | 6.5    | 20.73  |
	| 2.1054        | 21.74 | 7500  | 2.3563          | 20.95  | 6.54   | 20.48  |
	| 2.0658        | 23.19 | 8000  | 2.3575          | 19.73  | 6.18   | 19.54  |
	| 2.0461        | 24.64 | 8500  | 2.3382          | 20.78  | 6.42   | 20.52  |
	| 2.0135        | 26.09 | 9000  | 2.3628          | 20.94  | 6.55   | 20.66  |
	| 2.0122        | 27.54 | 9500  | 2.3725          | 21.1   | 6.87   | 20.96  |
	| 1.9623        | 28.99 | 10000 | 2.3612          | 21.38  | 6.57   | 21.08  |
	| 1.9518        | 30.43 | 10500 | 2.3619          | 20.12  | 6.25   | 19.8   |
	| 1.9327        | 31.88 | 11000 | 2.3642          | 20.9   | 6.6    | 20.55  |
	| 1.9147        | 33.33 | 11500 | 2.3703          | 21.0   | 6.37   | 20.59  |
	| 1.9145        | 34.78 | 12000 | 2.3823          | 21.24  | 6.84   | 20.92  |
	| 1.9065        | 36.23 | 12500 | 2.3686          | 20.16  | 6.41   | 19.87  |
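
In the table, validation loss reaches its minimum early while the ROUGE scores keep improving for several thousand more steps. The sketch below, with values copied from the table, picks the best row under each criterion. The card's headline results match the step-10000 row, which is consistent with selecting the checkpoint by ROUGE-1, though the card does not state how the final checkpoint was chosen.

```python
# (step, validation_loss, rouge1, rouge2, rougeL), copied from the table above
ROWS = [
    (500, 2.6053, 16.9, 4.9, 16.73),
    (1000, 2.4878, 17.96, 5.26, 17.82),
    (1500, 2.4109, 17.4, 5.08, 17.14),
    (2000, 2.4006, 18.53, 5.29, 18.21),
    (2500, 2.4017, 18.69, 5.71, 18.53),
    (3000, 2.3917, 19.32, 5.89, 19.12),
    (3500, 2.3994, 20.56, 6.08, 20.19),
    (4000, 2.3608, 20.11, 6.52, 19.75),
    (4500, 2.3251, 19.98, 6.26, 19.76),
    (5000, 2.3387, 19.71, 6.11, 19.42),
    (5500, 2.3372, 20.18, 6.34, 19.8),
    (6000, 2.3410, 20.58, 6.35, 20.14),
    (6500, 2.3432, 20.93, 6.5, 20.63),
    (7000, 2.3404, 21.0, 6.5, 20.73),
    (7500, 2.3563, 20.95, 6.54, 20.48),
    (8000, 2.3575, 19.73, 6.18, 19.54),
    (8500, 2.3382, 20.78, 6.42, 20.52),
    (9000, 2.3628, 20.94, 6.55, 20.66),
    (9500, 2.3725, 21.1, 6.87, 20.96),
    (10000, 2.3612, 21.38, 6.57, 21.08),
    (10500, 2.3619, 20.12, 6.25, 19.8),
    (11000, 2.3642, 20.9, 6.6, 20.55),
    (11500, 2.3703, 21.0, 6.37, 20.59),
    (12000, 2.3823, 21.24, 6.84, 20.92),
    (12500, 2.3686, 20.16, 6.41, 19.87),
]

best_by_loss = min(ROWS, key=lambda row: row[1])    # lowest validation loss
best_by_rouge1 = max(ROWS, key=lambda row: row[2])  # highest ROUGE-1

print("lowest validation loss:", best_by_loss[1], "at step", best_by_loss[0])
print("highest ROUGE-1:", best_by_rouge1[2], "at step", best_by_rouge1[0])
```

The two criteria disagree: the lowest validation loss is at step 4500, while the highest ROUGE-1 is at step 10000.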


	### Framework versions

	- Transformers 4.26.1
	- Pytorch 1.13.1+cu116
	- Datasets 2.10.1
	- Tokenizers 0.13.2
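
To compare a local environment against the versions listed above, a small check such as the following can help. It is a sketch, assuming the usual PyPI distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) and ignoring the CUDA build tag (`+cu116`) on PyTorch. The helper names `installed_versions` and `mismatches` are hypothetical.

```python
from importlib import metadata

# Versions listed in the card; the "+cu116" build tag on PyTorch is dropped.
EXPECTED = {
    "transformers": "4.26.1",
    "torch": "1.13.1",
    "datasets": "2.10.1",
    "tokenizers": "0.13.2",
}


def installed_versions(packages=EXPECTED):
    """Return {package: installed version, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches(found, expected=EXPECTED):
    """List (package, expected, installed) where versions differ, ignoring '+local' tags."""
    diffs = []
    for name, want in expected.items():
        have = found.get(name)
        base = have.split("+")[0] if have else None
        if base != want:
            diffs.append((name, want, have))
    return diffs
```

Running `print(mismatches(installed_versions()))` prints an empty list when the environment matches the card.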