	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	datasets: din0s/asqa
	model-index:
	- name: t5-base-asqa-cb
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# t5-base-asqa-cb

	This model is a fine-tuned version of [t5-base](https://huggingface.co/t5-base) on the [ASQA](https://huggingface.co/datasets/din0s/asqa) dataset.
	At the last logged evaluation (epoch 37, step 10101), it achieves the following results on the evaluation set:
	- Loss: 2.7489
	- Rougelsum: 26.6134
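
A minimal inference sketch, assuming the checkpoint is available on the Hub as `din0s/t5-base-asqa-cb` (the repository ID is inferred from the model name and the dataset namespace, not stated in this card) and that the model takes the bare question as input. The card does not document the input format, any task prefix, or generation settings, so `generate_answer` and its `max_new_tokens` default are illustrative placeholders rather than the author's setup.

```python
MODEL_ID = "din0s/t5-base-asqa-cb"  # assumed Hub repository ID


def generate_answer(question: str, max_new_tokens: int = 128) -> str:
    """Generate an answer for a single question (hypothetical helper)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Assumption: the raw question is passed without a task prefix.
    inputs = tokenizer(question, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_answer("When was the first Harry Potter book published?")` downloads the weights on first use; for repeated calls it would be better to load the tokenizer and model once and reuse them.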

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-05
	- train_batch_size: 16
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 50 (the training results below end at epoch 37)
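
To make the `linear` scheduler concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (none are listed in this card) and a schedule planned over all 50 configured epochs at 273 steps per epoch, the step count logged at epoch 1.0. `linear_lr` is a hypothetical helper written for illustration, not the Trainer's own code.

```python
BASE_LR = 1e-5          # learning_rate from the list above
STEPS_PER_EPOCH = 273   # step count logged at epoch 1.0
NUM_EPOCHS = 50         # num_epochs from the list above
WARMUP_STEPS = 0        # assumption: no warmup is listed in the card

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 13650 planned optimizer steps


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for epoch in (1, 10, 37, 50):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>2} (step {step:>5}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate has decayed to about 2.6e-06 by step 10101, roughly a quarter of its starting value.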

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rougelsum |
	|:-------------:|:-----:|:-----:|:---------------:|:---------:|
	| No log | 1.0 | 273 | 2.9648 | 23.8374 |
	| 3.5538 | 2.0 | 546 | 2.9054 | 24.2701 |
	| 3.5538 | 3.0 | 819 | 2.8744 | 24.4172 |
	| 3.1468 | 4.0 | 1092 | 2.8557 | 24.5949 |
	| 3.1468 | 5.0 | 1365 | 2.8400 | 24.7069 |
	| 3.0711 | 6.0 | 1638 | 2.8280 | 24.8685 |
	| 3.0711 | 7.0 | 1911 | 2.8191 | 24.9829 |
	| 3.0348 | 8.0 | 2184 | 2.8109 | 25.0908 |
	| 3.0348 | 9.0 | 2457 | 2.8038 | 25.2485 |
	| 2.9962 | 10.0 | 2730 | 2.7978 | 25.3279 |
	| 2.9635 | 11.0 | 3003 | 2.7920 | 25.4465 |
	| 2.9635 | 12.0 | 3276 | 2.7878 | 25.5927 |
	| 2.9328 | 13.0 | 3549 | 2.7833 | 25.6925 |
	| 2.9328 | 14.0 | 3822 | 2.7809 | 25.7563 |
	| 2.9126 | 15.0 | 4095 | 2.7773 | 25.8123 |
	| 2.9126 | 16.0 | 4368 | 2.7747 | 25.9039 |
	| 2.8878 | 17.0 | 4641 | 2.7719 | 25.9636 |
	| 2.8878 | 18.0 | 4914 | 2.7693 | 26.0025 |
	| 2.8744 | 19.0 | 5187 | 2.7673 | 26.0578 |
	| 2.8744 | 20.0 | 5460 | 2.7656 | 26.1161 |
	| 2.8579 | 21.0 | 5733 | 2.7629 | 26.1490 |
	| 2.8418 | 22.0 | 6006 | 2.7614 | 26.1830 |
	| 2.8418 | 23.0 | 6279 | 2.7604 | 26.2146 |
	| 2.8256 | 24.0 | 6552 | 2.7586 | 26.2899 |
	| 2.8256 | 25.0 | 6825 | 2.7586 | 26.2724 |
	| 2.8093 | 26.0 | 7098 | 2.7566 | 26.3183 |
	| 2.8093 | 27.0 | 7371 | 2.7551 | 26.3365 |
	| 2.8083 | 28.0 | 7644 | 2.7546 | 26.3950 |
	| 2.8083 | 29.0 | 7917 | 2.7537 | 26.4357 |
	| 2.7917 | 30.0 | 8190 | 2.7529 | 26.4681 |
	| 2.7917 | 31.0 | 8463 | 2.7526 | 26.5021 |
	| 2.785 | 32.0 | 8736 | 2.7512 | 26.5241 |
	| 2.7779 | 33.0 | 9009 | 2.7510 | 26.5361 |
	| 2.7779 | 34.0 | 9282 | 2.7502 | 26.5620 |
	| 2.771 | 35.0 | 9555 | 2.7495 | 26.6038 |
	| 2.771 | 36.0 | 9828 | 2.7488 | 26.6161 |
	| 2.7647 | 37.0 | 10101 | 2.7489 | 26.6134 |
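
The final row is not quite the best one in the log. The short sketch below picks the best checkpoint by each metric from the last five rows of the table; every earlier row has a higher validation loss and a lower Rougelsum, so these rows are enough. The `rows` tuples are copied from the table, and the variable names are illustrative.

```python
# (epoch, validation_loss, rougelsum) copied from the last five table rows
rows = [
    (33, 2.7510, 26.5361),
    (34, 2.7502, 26.5620),
    (35, 2.7495, 26.6038),
    (36, 2.7488, 26.6161),
    (37, 2.7489, 26.6134),
]

best_by_loss = min(rows, key=lambda r: r[1])    # lowest validation loss
best_by_rouge = max(rows, key=lambda r: r[2])   # highest Rougelsum
final = rows[-1]

print(f"best loss:      epoch {best_by_loss[0]} ({best_by_loss[1]:.4f})")
print(f"best Rougelsum: epoch {best_by_rouge[0]} ({best_by_rouge[2]:.4f})")
print(f"final row:      epoch {final[0]} ({final[1]:.4f}, {final[2]:.4f})")
```

Both metrics peak at epoch 36, and the margin over epoch 37 is very small (0.0001 in loss, 0.0027 in Rougelsum), so the two checkpoints are effectively equivalent on this evaluation set.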


	### Framework versions

	- Transformers 4.23.0.dev0
	- Pytorch 1.12.1+cu102
	- Datasets 2.4.0
	- Tokenizers 0.12.1
