	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	model-index:
	- name: t5-end2end-questions-generation_2
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# t5-end2end-questions-generation_2

	This model is a fine-tuned version of [t5-base](https://huggingface.co/t5-base) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.6223

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 1
	- eval_batch_size: 1
	- seed: 42
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 4
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 7
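The relationship between the batch-size entries above, and the shape of a linear learning-rate schedule, can be sketched in plain Python. This is an illustration under assumptions, not the training script: the card does not list a device count or warmup steps, so `num_devices = 1` and `warmup_steps = 0` are assumed, and `total_steps = 1000` is a hypothetical value chosen only to make the arithmetic easy to follow.

```python
# Values copied from the hyperparameter list in this card.
train_batch_size = 1
gradient_accumulation_steps = 4
learning_rate = 1e-4

# Assumption: one device. The card does not state the device count.
num_devices = 1

# Effective batch size per optimizer step.
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)


def linear_lr(step, total_steps, base_lr=learning_rate, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total_steps = 1000  # hypothetical, for illustration only
    print("effective batch size:", total_train_batch_size)
    for step in (0, 500, 1000):
        print(step, linear_lr(step, total_steps))
```

With a per-device batch of 1 and 4 accumulation steps, each optimizer update sees 4 examples, which matches the listed `total_train_batch_size`.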

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 2.7103 | 0.13 | 10 | 1.7584 |
	| 1.8298 | 0.26 | 20 | 1.3377 |
	| 1.4424 | 0.39 | 30 | 1.1610 |
	| 1.4063 | 0.52 | 40 | 1.0564 |
	| 1.2738 | 0.65 | 50 | 1.0332 |
	| 1.2477 | 0.78 | 60 | 0.9531 |
	| 1.146 | 0.91 | 70 | 0.9050 |
	| 1.0134 | 1.04 | 80 | 0.9388 |
	| 0.8782 | 1.17 | 90 | 0.9215 |
	| 0.8869 | 1.3 | 100 | 0.8930 |
	| 0.8963 | 1.43 | 110 | 0.8996 |
	| 0.9138 | 1.56 | 120 | 0.8616 |
	| 0.7963 | 1.69 | 130 | 0.8060 |
	| 0.8611 | 1.82 | 140 | 0.7611 |
	| 1.0504 | 1.95 | 150 | 0.7606 |
	| 0.6802 | 2.08 | 160 | 0.7791 |
	| 0.7488 | 2.21 | 170 | 0.7470 |
	| 0.6659 | 2.34 | 180 | 0.7367 |
	| 0.7061 | 2.47 | 190 | 0.7194 |
	| 0.6771 | 2.6 | 200 | 0.7006 |
	| 0.7267 | 2.73 | 210 | 0.6858 |
	| 0.7251 | 2.86 | 220 | 0.6797 |
	| 0.7426 | 2.99 | 230 | 0.6877 |
	| 0.5425 | 3.12 | 240 | 0.6994 |
	| 0.5298 | 3.25 | 250 | 0.7096 |
	| 0.697 | 3.38 | 260 | 0.6941 |
	| 0.5643 | 3.51 | 270 | 0.6534 |
	| 0.6983 | 3.64 | 280 | 0.6407 |
	| 0.587 | 3.77 | 290 | 0.6404 |
	| 0.6487 | 3.9 | 300 | 0.6489 |
	| 0.5862 | 4.03 | 310 | 0.6567 |
	| 0.5524 | 4.16 | 320 | 0.6610 |
	| 0.5432 | 4.29 | 330 | 0.6609 |
	| 0.5165 | 4.42 | 340 | 0.6558 |
	| 0.5248 | 4.55 | 350 | 0.6387 |
	| 0.5322 | 4.68 | 360 | 0.6319 |
	| 0.5272 | 4.81 | 370 | 0.6214 |
	| 0.5555 | 4.94 | 380 | 0.6252 |
	| 0.597 | 5.06 | 390 | 0.6281 |
	| 0.5745 | 5.19 | 400 | 0.6283 |
	| 0.5156 | 5.32 | 410 | 0.6265 |
	| 0.4898 | 5.45 | 420 | 0.6307 |
	| 0.543 | 5.58 | 430 | 0.6280 |
	| 0.5094 | 5.71 | 440 | 0.6295 |
	| 0.5023 | 5.84 | 450 | 0.6279 |
	| 0.4483 | 5.97 | 460 | 0.6228 |
	| 0.5134 | 6.1 | 470 | 0.6239 |
	| 0.5054 | 6.23 | 480 | 0.6230 |
	| 0.4632 | 6.36 | 490 | 0.6205 |
	| 0.5016 | 6.49 | 500 | 0.6212 |
	| 0.4838 | 6.62 | 510 | 0.6219 |
	| 0.4613 | 6.75 | 520 | 0.6225 |
	| 0.5062 | 6.88 | 530 | 0.6223 |
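The evaluation loss reported at the top of this card (0.6223) is the value at the last logged step, which is not the lowest validation loss in the table. A minimal sketch that picks out the best logged step, using only the last eight rows copied from the table (the variable names are illustrative):

```python
# (step, validation_loss) pairs copied from the last eight table rows.
rows = [
    (460, 0.6228),
    (470, 0.6239),
    (480, 0.6230),
    (490, 0.6205),
    (500, 0.6212),
    (510, 0.6219),
    (520, 0.6225),
    (530, 0.6223),
]

# Row with the lowest validation loss.
best_step, best_loss = min(rows, key=lambda row: row[1])

# Last logged row, which is what the card reports as the evaluation loss.
final_step, final_loss = rows[-1]

if __name__ == "__main__":
    print(f"best:  step {best_step}, loss {best_loss:.4f}")
    print(f"final: step {final_step}, loss {final_loss:.4f}")
```

The lowest value in these rows, 0.6205 at step 490, is also the lowest in the full table. The card does not say which checkpoint was saved, so whether the published weights correspond to step 490 or to the final step is unknown.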


	### Framework versions

	- Transformers 4.28.1
	- Pytorch 2.0.1+cu117
	- Datasets 2.12.0
	- Tokenizers 0.13.3
