codeparrot-ds / README.md

yangdechuan

End of training

7eb262a about 1 year ago

	---
	license: mit
	base_model: gpt2
	tags:
	- generated_from_trainer
	model-index:
	- name: codeparrot-ds
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# codeparrot-ds

	This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 1.0621
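
For a causal language model, the evaluation loss is often easier to read as a perplexity. The snippet below is a minimal sketch of that conversion. It assumes the reported loss is the mean per-token cross-entropy in nats, which is what the Trainer normally logs for this kind of model but is not stated in this card.

```python
import math

# Final validation loss reported in this card.
eval_loss = 1.0621

# Assumption: the loss is the mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")  # prints "perplexity ~ 2.89"
```

Under that assumption the model is, on average, about as uncertain as a uniform choice among roughly three tokens at each position of the evaluation set.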

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0005
	- train_batch_size: 32
	- eval_batch_size: 32
	- seed: 42
	- gradient_accumulation_steps: 8
	- total_train_batch_size: 256
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_steps: 1000
	- num_epochs: 1
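
The values above relate to each other in two ways that a short sketch can make concrete: how `total_train_batch_size` follows from the per-device batch size and gradient accumulation, and what a cosine schedule with linear warmup does to the learning rate. The device count (`NUM_DEVICES = 1`) and the total number of optimizer steps (`TOTAL_STEPS = 65_000`, taken from the last logged step in the results table) are assumptions, not values stated in this card, and `lr_at` is an illustrative helper rather than the Trainer's own code.

```python
import math

# Values taken from the hyperparameter list in this card.
LEARNING_RATE = 5e-4
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 1000

# Assumptions (not stated in the card).
NUM_DEVICES = 1        # a single device reproduces the reported total of 256
TOTAL_STEPS = 65_000   # last step logged in the training results table


def effective_batch_size(per_device: int, accum_steps: int, devices: int) -> int:
    """Number of samples contributing to one optimizer update."""
    return per_device * accum_steps * devices


def lr_at(step: int,
          peak: float = LEARNING_RATE,
          warmup: int = WARMUP_STEPS,
          total: int = TOTAL_STEPS) -> float:
    """Linear warmup to `peak`, then half-cosine decay towards zero."""
    if step < warmup:
        return peak * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return peak * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for step in (0, 500, 1000, 33_000, 65_000):
        print(step, f"{lr_at(step):.6f}")
```

With these assumptions the learning rate climbs linearly to 0.0005 over the first 1000 steps, falls to half of that at step 33000, and reaches approximately zero at step 65000. That is consistent with the validation loss flattening over the last few thousand steps in the results table.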

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:-----:|:---------------:|
	| 3.2102 | 0.02 | 1000 | 2.7478 |
	| 2.359 | 0.03 | 2000 | 2.2031 |
	| 2.0974 | 0.05 | 3000 | 1.9751 |
	| 1.9383 | 0.06 | 4000 | 1.8321 |
	| 1.8346 | 0.08 | 5000 | 1.7406 |
	| 1.7547 | 0.09 | 6000 | 1.6731 |
	| 1.6994 | 0.11 | 7000 | 1.6212 |
	| 1.6632 | 0.12 | 8000 | 1.5842 |
	| 1.6237 | 0.14 | 9000 | 1.5506 |
	| 1.5986 | 0.15 | 10000 | 1.5247 |
	| 1.5749 | 0.17 | 11000 | 1.4994 |
	| 1.5466 | 0.18 | 12000 | 1.4783 |
	| 1.5254 | 0.2 | 13000 | 1.4579 |
	| 1.5085 | 0.21 | 14000 | 1.4420 |
	| 1.4884 | 0.23 | 15000 | 1.4235 |
	| 1.4842 | 0.25 | 16000 | 1.4088 |
	| 1.4618 | 0.26 | 17000 | 1.3957 |
	| 1.4479 | 0.28 | 18000 | 1.3825 |
	| 1.4376 | 0.29 | 19000 | 1.3716 |
	| 1.4225 | 0.31 | 20000 | 1.3583 |
	| 1.4151 | 0.32 | 21000 | 1.3476 |
	| 1.4021 | 0.34 | 22000 | 1.3359 |
	| 1.3956 | 0.35 | 23000 | 1.3245 |
	| 1.3839 | 0.37 | 24000 | 1.3159 |
	| 1.3741 | 0.38 | 25000 | 1.3060 |
	| 1.3635 | 0.4 | 26000 | 1.2950 |
	| 1.3491 | 0.41 | 27000 | 1.2844 |
	| 1.3462 | 0.43 | 28000 | 1.2760 |
	| 1.3317 | 0.44 | 29000 | 1.2676 |
	| 1.3249 | 0.46 | 30000 | 1.2584 |
	| 1.3164 | 0.48 | 31000 | 1.2486 |
	| 1.3055 | 0.49 | 32000 | 1.2406 |
	| 1.3006 | 0.51 | 33000 | 1.2327 |
	| 1.2906 | 0.52 | 34000 | 1.2225 |
	| 1.2821 | 0.54 | 35000 | 1.2135 |
	| 1.2677 | 0.55 | 36000 | 1.2068 |
	| 1.2562 | 0.57 | 37000 | 1.1981 |
	| 1.2541 | 0.58 | 38000 | 1.1896 |
	| 1.2377 | 0.6 | 39000 | 1.1814 |
	| 1.2346 | 0.61 | 40000 | 1.1726 |
	| 1.2251 | 0.63 | 41000 | 1.1647 |
	| 1.2175 | 0.64 | 42000 | 1.1575 |
	| 1.2112 | 0.66 | 43000 | 1.1486 |
	| 1.2021 | 0.67 | 44000 | 1.1410 |
	| 1.1888 | 0.69 | 45000 | 1.1339 |
	| 1.1939 | 0.71 | 46000 | 1.1259 |
	| 1.18 | 0.72 | 47000 | 1.1198 |
	| 1.1698 | 0.74 | 48000 | 1.1130 |
	| 1.1634 | 0.75 | 49000 | 1.1063 |
	| 1.1593 | 0.77 | 50000 | 1.1006 |
	| 1.1545 | 0.78 | 51000 | 1.0946 |
	| 1.1478 | 0.8 | 52000 | 1.0896 |
	| 1.1443 | 0.81 | 53000 | 1.0855 |
	| 1.1365 | 0.83 | 54000 | 1.0808 |
	| 1.1332 | 0.84 | 55000 | 1.0773 |
	| 1.1336 | 0.86 | 56000 | 1.0736 |
	| 1.1276 | 0.87 | 57000 | 1.0711 |
	| 1.1241 | 0.89 | 58000 | 1.0686 |
	| 1.123 | 0.9 | 59000 | 1.0665 |
	| 1.1187 | 0.92 | 60000 | 1.0647 |
	| 1.1123 | 0.93 | 61000 | 1.0636 |
	| 1.1159 | 0.95 | 62000 | 1.0628 |
	| 1.1133 | 0.97 | 63000 | 1.0623 |
	| 1.1181 | 0.98 | 64000 | 1.0621 |
	| 1.1125 | 1.0 | 65000 | 1.0621 |


	### Framework versions

	- Transformers 4.33.0.dev0
	- Pytorch 2.0.0+cu117
	- Datasets 2.14.4
	- Tokenizers 0.13.3
