	---
	license: mit
	tags:
	- generated_from_trainer
	datasets:
	- generator
	model-index:
	- name: gpt2_left_out_switchboard
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# gpt2_left_out_switchboard

	This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the generator dataset.
	It achieves the following results on the evaluation set:
	- Loss: 3.9378
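For intuition, a cross-entropy loss can be converted to perplexity by exponentiating it. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language models but is not stated in this card. `loss_to_perplexity` is an illustrative helper, not part of any library.

```python
import math


def loss_to_perplexity(mean_nll_nats: float) -> float:
    """Convert a mean per-token negative log-likelihood (in nats) to perplexity."""
    return math.exp(mean_nll_nats)


eval_loss = 3.9378  # final evaluation loss reported above
perplexity = loss_to_perplexity(eval_loss)
print(f"perplexity = {perplexity:.1f}")
```

Under that assumption, the final evaluation loss corresponds to a perplexity of roughly 51.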

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0005
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_steps: 1000
	- num_epochs: 10
	- mixed_precision_training: Native AMP
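The learning-rate settings above (peak 0.0005, cosine schedule, 1000 warmup steps) can be pictured with a small pure-Python sketch. It assumes linear warmup from zero followed by a half-cosine decay to zero, which is how a cosine schedule with warmup is commonly implemented; the card does not confirm the exact variant. `cosine_lr_with_warmup` is a hypothetical helper, and `total_steps=20_690` is an estimate derived from the results table in this card (about 2,069 steps per epoch over 10 epochs), not a reported value.

```python
import math


def cosine_lr_with_warmup(step, base_lr=5e-4, warmup_steps=1000, total_steps=20_690):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero.

    total_steps is an estimate (assumption), not a value stated in the card.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 500, 1000, 10_000, 20_690):
    print(step, f"{cosine_lr_with_warmup(step):.2e}")
```

With these assumptions the rate reaches its peak at step 1000 and falls to zero by the estimated final step.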

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss |
	|:-------------:|:-----:|:-----:|:---------------:|
	| 5.983         | 0.24  | 500   | 5.0786          |
	| 4.7603        | 0.48  | 1000  | 4.6865          |
	| 4.4521        | 0.73  | 1500  | 4.4635          |
	| 4.2512        | 0.97  | 2000  | 4.3124          |
	| 4.0458        | 1.21  | 2500  | 4.2272          |
	| 3.9687        | 1.45  | 3000  | 4.1443          |
	| 3.9024        | 1.69  | 3500  | 4.0705          |
	| 3.8439        | 1.93  | 4000  | 4.0057          |
	| 3.6791        | 2.18  | 4500  | 3.9845          |
	| 3.6259        | 2.42  | 5000  | 3.9471          |
	| 3.6137        | 2.66  | 5500  | 3.9057          |
	| 3.592         | 2.9   | 6000  | 3.8654          |
	| 3.4438        | 3.14  | 6500  | 3.8758          |
	| 3.3844        | 3.38  | 7000  | 3.8570          |
	| 3.3977        | 3.63  | 7500  | 3.8324          |
	| 3.4015        | 3.87  | 8000  | 3.8053          |
	| 3.2638        | 4.11  | 8500  | 3.8300          |
	| 3.1771        | 4.35  | 9000  | 3.8250          |
	| 3.1914        | 4.59  | 9500  | 3.8070          |
	| 3.1993        | 4.84  | 10000 | 3.7853          |
	| 3.1089        | 5.08  | 10500 | 3.8146          |
	| 2.9539        | 5.32  | 11000 | 3.8262          |
	| 2.9853        | 5.56  | 11500 | 3.8173          |
	| 2.9984        | 5.8   | 12000 | 3.8020          |
	| 2.9462        | 6.04  | 12500 | 3.8259          |
	| 2.7343        | 6.29  | 13000 | 3.8527          |
	| 2.7724        | 6.53  | 13500 | 3.8499          |
	| 2.7817        | 6.77  | 14000 | 3.8423          |
	| 2.7789        | 7.01  | 14500 | 3.8510          |
	| 2.5477        | 7.25  | 15000 | 3.8873          |
	| 2.5643        | 7.5   | 15500 | 3.8904          |
	| 2.5842        | 7.74  | 16000 | 3.8896          |
	| 2.5913        | 7.98  | 16500 | 3.8858          |
	| 2.4293        | 8.22  | 17000 | 3.9177          |
	| 2.4253        | 8.46  | 17500 | 3.9231          |
	| 2.4274        | 8.7   | 18000 | 3.9240          |
	| 2.4331        | 8.95  | 18500 | 3.9254          |
	| 2.362         | 9.19  | 19000 | 3.9346          |
	| 2.3519        | 9.43  | 19500 | 3.9373          |
	| 2.3498        | 9.67  | 20000 | 3.9378          |
	| 2.3461        | 9.91  | 20500 | 3.9378          |
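To read the table above more easily, the following sketch locates the evaluation step with the lowest validation loss and compares it with the final step. The `(step, validation_loss)` pairs are copied from the table; the variable names are illustrative only.

```python
# (step, validation_loss) pairs copied from the training results table.
val_loss_by_step = [
    (500, 5.0786), (1000, 4.6865), (1500, 4.4635), (2000, 4.3124),
    (2500, 4.2272), (3000, 4.1443), (3500, 4.0705), (4000, 4.0057),
    (4500, 3.9845), (5000, 3.9471), (5500, 3.9057), (6000, 3.8654),
    (6500, 3.8758), (7000, 3.8570), (7500, 3.8324), (8000, 3.8053),
    (8500, 3.8300), (9000, 3.8250), (9500, 3.8070), (10000, 3.7853),
    (10500, 3.8146), (11000, 3.8262), (11500, 3.8173), (12000, 3.8020),
    (12500, 3.8259), (13000, 3.8527), (13500, 3.8499), (14000, 3.8423),
    (14500, 3.8510), (15000, 3.8873), (15500, 3.8904), (16000, 3.8896),
    (16500, 3.8858), (17000, 3.9177), (17500, 3.9231), (18000, 3.9240),
    (18500, 3.9254), (19000, 3.9346), (19500, 3.9373), (20000, 3.9378),
    (20500, 3.9378),
]

# Step with the lowest validation loss, and the last evaluated step.
best_step, best_loss = min(val_loss_by_step, key=lambda row: row[1])
final_step, final_loss = val_loss_by_step[-1]

print(f"best:  step {best_step}, validation loss {best_loss:.4f}")
print(f"final: step {final_step}, validation loss {final_loss:.4f}")
print(f"gap:   {final_loss - best_loss:.4f}")
```

Validation loss is lowest at step 10000 (3.7853) and rises to 3.9378 by step 20500 while training loss keeps falling, a pattern usually read as overfitting. The evaluation loss reported at the top of this card matches the final step rather than the minimum; the card does not say whether an earlier checkpoint was kept.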


	### Framework versions

	- Transformers 4.26.1
	- PyTorch 1.11.0+cu113
	- Datasets 2.13.0
	- Tokenizers 0.13.3
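When trying to reproduce results, it can help to compare the local environment against the versions listed above. This is a minimal sketch using only the standard library; it assumes the distributions are installed under the names `transformers`, `torch`, `datasets`, and `tokenizers`, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card; the distribution names are assumptions
# ("torch" is assumed to be the installed name for PyTorch).
CARD_VERSIONS = {
    "transformers": "4.26.1",
    "torch": "1.11.0+cu113",
    "datasets": "2.13.0",
    "tokenizers": "0.13.3",
}


def compare_versions(expected):
    """Return {package: (expected_version, installed_version_or_None)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(CARD_VERSIONS).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: card={wanted} installed={installed} ({status})")
```

A mismatch does not necessarily prevent the model from loading; the check only flags differences from the environment recorded here.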
