	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	datasets:
	- wikihow
	metrics:
	- rouge
	model-index:
	- name: t5-small-finetuned-wikihow_3epoch_b4_lr3e-5
	  results:
	  - task:
	      name: Sequence-to-sequence Language Modeling
	      type: text2text-generation
	    dataset:
	      name: wikihow
	      type: wikihow
	      args: all
	    metrics:
	    - name: Rouge1
	      type: rouge
	      value: 26.1071
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# t5-small-finetuned-wikihow_3epoch_b4_lr3e-5

	This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the wikihow dataset (`all` configuration).
	At the final logged evaluation (step 115000, epoch 2.93), it achieves the following results on the evaluation set:
	- Loss: 2.4351
	- Rouge1: 26.1071
	- Rouge2: 9.3627
	- Rougel: 22.0825
	- Rougelsum: 25.4514
	- Gen Len: 18.474
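To make the Rouge1 figure easier to interpret, here is a minimal sketch of a ROUGE-1 F1 computation (clipped unigram overlap between a reference and a generated summary). It assumes plain lowercase whitespace tokenization, so it is not the exact scorer used to produce the numbers above, and it returns values in the range 0 to 1, whereas the card's scores appear to be on a 0 to 100 scale. The function name `rouge1_f1` and the example sentences are hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the wikihow dataset.
    score = rouge1_f1("keep the soil moist", "keep soil moist daily")
    print(f"ROUGE-1 F1: {score:.4f}")
```

Rouge2 applies the same idea to bigrams, and Rougel is based on the longest common subsequence instead of n-gram overlap.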

	## Model description

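	A [t5-small](https://huggingface.co/t5-small) checkpoint fine-tuned on wikihow for 3 epochs with a train batch size of 4 and a learning rate of 3e-05, as the model name encodes. More information needed.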

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 3e-05
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 3
	- mixed_precision_training: Native AMP
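As an illustration of what the `linear` scheduler entry implies, the sketch below computes the learning rate after a given number of optimizer steps under linear decay from 3e-05 to zero. Two things are assumptions rather than facts from this card: that no warmup was used (none is listed above), and the total step count, which is estimated from the last logged row of the training results (step 115000 at epoch 2.93). The names `linear_lr`, `STEPS_PER_EPOCH_ESTIMATE`, and `TOTAL_STEPS_ESTIMATE` are illustrative.

```python
BASE_LR = 3e-05   # learning_rate from the hyperparameter list
NUM_EPOCHS = 3    # num_epochs from the hyperparameter list

# Assumption: estimated from the last logged row (step 115000 at epoch 2.93).
# The exact value depends on the training set size, which the card does not state.
STEPS_PER_EPOCH_ESTIMATE = round(115000 / 2.93)
TOTAL_STEPS_ESTIMATE = STEPS_PER_EPOCH_ESTIMATE * NUM_EPOCHS


def linear_lr(step, total_steps=TOTAL_STEPS_ESTIMATE, base_lr=BASE_LR, warmup_steps=0):
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 5000, 60000, 115000):
        print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate is already close to zero by the last logged evaluation, which is consistent with the small changes in validation loss over the final rows of the results table.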

	### Training results

	| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum | Gen Len |
	|:-------------:|:-----:|:------:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
	| 2.9216        | 0.13  | 5000   | 2.6385          | 23.8039 | 7.8863 | 20.0109 | 23.0802   | 18.3481 |
	| 2.8158        | 0.25  | 10000  | 2.5884          | 24.2567 | 8.2003 | 20.438  | 23.5325   | 18.3833 |
	| 2.7743        | 0.38  | 15000  | 2.5623          | 24.8471 | 8.3768 | 20.8711 | 24.1114   | 18.2901 |
	| 2.7598        | 0.51  | 20000  | 2.5368          | 25.1566 | 8.6721 | 21.1896 | 24.4558   | 18.3561 |
	| 2.7192        | 0.64  | 25000  | 2.5220          | 25.3477 | 8.8106 | 21.3799 | 24.6742   | 18.3108 |
	| 2.7207        | 0.76  | 30000  | 2.5114          | 25.5912 | 8.998  | 21.5508 | 24.9344   | 18.3445 |
	| 2.7041        | 0.89  | 35000  | 2.4993          | 25.457  | 8.8644 | 21.4516 | 24.7965   | 18.4354 |
	| 2.687         | 1.02  | 40000  | 2.4879          | 25.5886 | 8.9766 | 21.6794 | 24.9512   | 18.4035 |
	| 2.6652        | 1.14  | 45000  | 2.4848          | 25.7367 | 9.078  | 21.7096 | 25.0924   | 18.4328 |
	| 2.6536        | 1.27  | 50000  | 2.4761          | 25.7368 | 9.1609 | 21.729  | 25.0866   | 18.3117 |
	| 2.6589        | 1.4   | 55000  | 2.4702          | 25.7738 | 9.1413 | 21.7492 | 25.114    | 18.4862 |
	| 2.6384        | 1.53  | 60000  | 2.4620          | 25.7433 | 9.1356 | 21.8198 | 25.0896   | 18.489  |
	| 2.6337        | 1.65  | 65000  | 2.4595          | 26.0919 | 9.2605 | 21.9447 | 25.4065   | 18.4083 |
	| 2.6375        | 1.78  | 70000  | 2.4557          | 26.0912 | 9.3469 | 22.0182 | 25.4428   | 18.4133 |
	| 2.6441        | 1.91  | 75000  | 2.4502          | 26.1366 | 9.3143 | 22.058  | 25.4673   | 18.4972 |
	| 2.6276        | 2.03  | 80000  | 2.4478          | 25.9929 | 9.2464 | 21.9271 | 25.3263   | 18.469  |
	| 2.6062        | 2.16  | 85000  | 2.4467          | 26.0465 | 9.3166 | 22.0342 | 25.3998   | 18.3777 |
	| 2.6126        | 2.29  | 90000  | 2.4407          | 26.1953 | 9.3848 | 22.1148 | 25.5161   | 18.467  |
	| 2.6182        | 2.42  | 95000  | 2.4397          | 26.1331 | 9.3626 | 22.1076 | 25.4627   | 18.4413 |
	| 2.6041        | 2.54  | 100000 | 2.4375          | 26.1301 | 9.3567 | 22.0869 | 25.465    | 18.4929 |
	| 2.5996        | 2.67  | 105000 | 2.4367          | 26.0956 | 9.3314 | 22.063  | 25.4242   | 18.5074 |
	| 2.6144        | 2.8   | 110000 | 2.4355          | 26.1764 | 9.4157 | 22.1231 | 25.5175   | 18.4729 |
	| 2.608         | 2.93  | 115000 | 2.4351          | 26.1071 | 9.3627 | 22.0825 | 25.4514   | 18.474  |
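Because the table logs an evaluation every 5000 steps, it can be used to check whether the final checkpoint is also the best one on each metric. The sketch below does this for validation loss and Rouge1, with the step, validation loss, and Rouge1 columns copied from the table; the helper name `best_by` is illustrative.

```python
# (step, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (5000, 2.6385, 23.8039),
    (10000, 2.5884, 24.2567),
    (15000, 2.5623, 24.8471),
    (20000, 2.5368, 25.1566),
    (25000, 2.5220, 25.3477),
    (30000, 2.5114, 25.5912),
    (35000, 2.4993, 25.457),
    (40000, 2.4879, 25.5886),
    (45000, 2.4848, 25.7367),
    (50000, 2.4761, 25.7368),
    (55000, 2.4702, 25.7738),
    (60000, 2.4620, 25.7433),
    (65000, 2.4595, 26.0919),
    (70000, 2.4557, 26.0912),
    (75000, 2.4502, 26.1366),
    (80000, 2.4478, 25.9929),
    (85000, 2.4467, 26.0465),
    (90000, 2.4407, 26.1953),
    (95000, 2.4397, 26.1331),
    (100000, 2.4375, 26.1301),
    (105000, 2.4367, 26.0956),
    (110000, 2.4355, 26.1764),
    (115000, 2.4351, 26.1071),
]


def best_by(rows, key, highest=True):
    """Return the row with the highest (or lowest) value of `key`."""
    pick = max if highest else min
    return pick(rows, key=key)


best_rouge = best_by(ROWS, key=lambda row: row[2])
best_loss = best_by(ROWS, key=lambda row: row[1], highest=False)
final = ROWS[-1]

print(f"Lowest validation loss: {best_loss[1]} at step {best_loss[0]}")
print(f"Highest Rouge1:         {best_rouge[2]} at step {best_rouge[0]}")
print(f"Final logged Rouge1:    {final[2]} at step {final[0]}")
```

On these numbers the validation loss is lowest at the final logged step (115000), while Rouge1 peaks slightly earlier, at step 90000 (26.1953 versus 26.1071 at the end).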


	### Framework versions

	- Transformers 4.18.0
	- PyTorch 1.10.0+cu111
	- Datasets 2.0.0
	- Tokenizers 0.11.6