	---
	tags:
	- generated_from_keras_callback
	model-index:
	- name: distilgpt_new_0100
	results: []
	---

	<!-- This model card has been generated automatically according to the information Keras had access to. You should
	probably proofread and complete it, then remove this comment. -->

	# distilgpt_new_0100

	This model was trained from scratch on an unknown dataset.
	It reports the following losses after the final training epoch (epoch 99; epochs are numbered from 0):
	- Train Loss: 1.0286
	- Validation Loss: 0.9952
	- Epoch: 99
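The final losses can be read as perplexities. This is a minimal sketch that assumes the reported loss is a mean per-token cross-entropy in nats, which is the usual causal language modeling loss but is not confirmed by this card. The function name `perplexity` is illustrative.

```python
import math

# Final-epoch losses reported in this card.
TRAIN_LOSS = 1.0286
VALIDATION_LOSS = 0.9952


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity.

    Assumption: the loss in the card is a natural-log cross-entropy
    averaged over tokens. If it is not, this conversion does not apply.
    """
    return math.exp(mean_cross_entropy_nats)


train_ppl = perplexity(TRAIN_LOSS)
val_ppl = perplexity(VALIDATION_LOSS)

print(f"train perplexity:      {train_ppl:.3f}")
print(f"validation perplexity: {val_ppl:.3f}")
```

Under that assumption, a validation loss just below 1.0 corresponds to a perplexity a little under e. Because the dataset and tokenizer are unknown, the figure is not comparable with perplexities reported for other models.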

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
	- training_precision: float32
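The logged optimizer entry is a Python dict literal, so it can be parsed and reused directly. The sketch below is one possible way to turn it back into an optimizer, not the code used to train this model. The helper names `optimizer_kwargs` and `build_optimizer` are hypothetical, and `build_optimizer` assumes a `transformers` installation with TensorFlow support, where `AdamWeightDecay` is available.

```python
import ast

# Optimizer entry copied verbatim from the hyperparameter list above.
LOGGED_OPTIMIZER = (
    "{'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, "
    "'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, "
    "'weight_decay_rate': 0.01}"
)

# literal_eval only accepts Python literals, so it is safe for logged text.
config = ast.literal_eval(LOGGED_OPTIMIZER)


def optimizer_kwargs(cfg: dict) -> dict:
    """Return constructor keyword arguments from the logged config.

    'decay' is the legacy Keras learning-rate decay. It is 0.0 here (no
    decay), and it is dropped on the assumption that some Keras versions
    may not accept it as a keyword argument.
    """
    return {key: value for key, value in cfg.items() if key != "decay"}


def build_optimizer(cfg: dict):
    """Hypothetical helper: rebuild the optimizer from the logged config.

    Requires transformers with TensorFlow installed, so the import is
    kept local and the function is not called at import time.
    """
    from transformers import AdamWeightDecay

    return AdamWeightDecay(**optimizer_kwargs(cfg))


kwargs = optimizer_kwargs(config)
print(kwargs)
```

With TensorFlow available, `build_optimizer(config)` returns an optimizer that could be passed to `model.compile(optimizer=...)`. Behavior may differ from the original run if the installed versions do not match those listed under Framework versions.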

	### Training results

	| Train Loss | Validation Loss | Epoch |
	|:----------:|:---------------:|:-----:|
	| 3.5889 | 2.6197 | 0 |
	| 2.4784 | 2.2040 | 1 |
	| 2.1855 | 1.9980 | 2 |
	| 2.0181 | 1.8643 | 3 |
	| 1.9031 | 1.7652 | 4 |
	| 1.8166 | 1.6924 | 5 |
	| 1.7467 | 1.6360 | 6 |
	| 1.6904 | 1.5843 | 7 |
	| 1.6430 | 1.5421 | 8 |
	| 1.6021 | 1.5059 | 9 |
	| 1.5668 | 1.4761 | 10 |
	| 1.5359 | 1.4481 | 11 |
	| 1.5071 | 1.4220 | 12 |
	| 1.4841 | 1.4020 | 13 |
	| 1.4608 | 1.3797 | 14 |
	| 1.4399 | 1.3595 | 15 |
	| 1.4213 | 1.3426 | 16 |
	| 1.4031 | 1.3266 | 17 |
	| 1.3875 | 1.3113 | 18 |
	| 1.3735 | 1.3024 | 19 |
	| 1.3600 | 1.2871 | 20 |
	| 1.3456 | 1.2753 | 21 |
	| 1.3336 | 1.2648 | 22 |
	| 1.3214 | 1.2539 | 23 |
	| 1.3103 | 1.2451 | 24 |
	| 1.3005 | 1.2335 | 25 |
	| 1.2905 | 1.2258 | 26 |
	| 1.2815 | 1.2179 | 27 |
	| 1.2728 | 1.2123 | 28 |
	| 1.2643 | 1.2029 | 29 |
	| 1.2564 | 1.1980 | 30 |
	| 1.2494 | 1.1877 | 31 |
	| 1.2414 | 1.1806 | 32 |
	| 1.2348 | 1.1788 | 33 |
	| 1.2290 | 1.1699 | 34 |
	| 1.2209 | 1.1654 | 35 |
	| 1.2156 | 1.1575 | 36 |
	| 1.2110 | 1.1537 | 37 |
	| 1.2046 | 1.1499 | 38 |
	| 1.1986 | 1.1436 | 39 |
	| 1.1940 | 1.1408 | 40 |
	| 1.1877 | 1.1356 | 41 |
	| 1.1830 | 1.1314 | 42 |
	| 1.1779 | 1.1278 | 43 |
	| 1.1737 | 1.1211 | 44 |
	| 1.1692 | 1.1192 | 45 |
	| 1.1647 | 1.1163 | 46 |
	| 1.1611 | 1.1107 | 47 |
	| 1.1560 | 1.1066 | 48 |
	| 1.1521 | 1.1060 | 49 |
	| 1.1489 | 1.1002 | 50 |
	| 1.1440 | 1.0960 | 51 |
	| 1.1406 | 1.0931 | 52 |
	| 1.1373 | 1.0897 | 53 |
	| 1.1329 | 1.0855 | 54 |
	| 1.1302 | 1.0842 | 55 |
	| 1.1265 | 1.0818 | 56 |
	| 1.1237 | 1.0784 | 57 |
	| 1.1204 | 1.0737 | 58 |
	| 1.1173 | 1.0714 | 59 |
	| 1.1140 | 1.0694 | 60 |
	| 1.1112 | 1.0691 | 61 |
	| 1.1083 | 1.0668 | 62 |
	| 1.1044 | 1.0611 | 63 |
	| 1.1027 | 1.0607 | 64 |
	| 1.0990 | 1.0586 | 65 |
	| 1.0969 | 1.0545 | 66 |
	| 1.0944 | 1.0522 | 67 |
	| 1.0921 | 1.0517 | 68 |
	| 1.0891 | 1.0496 | 69 |
	| 1.0862 | 1.0457 | 70 |
	| 1.0828 | 1.0448 | 71 |
	| 1.0824 | 1.0439 | 72 |
	| 1.0793 | 1.0389 | 73 |
	| 1.0769 | 1.0375 | 74 |
	| 1.0740 | 1.0362 | 75 |
	| 1.0717 | 1.0358 | 76 |
	| 1.0700 | 1.0299 | 77 |
	| 1.0675 | 1.0312 | 78 |
	| 1.0639 | 1.0288 | 79 |
	| 1.0643 | 1.0270 | 80 |
	| 1.0607 | 1.0258 | 81 |
	| 1.0602 | 1.0233 | 82 |
	| 1.0568 | 1.0225 | 83 |
	| 1.0557 | 1.0198 | 84 |
	| 1.0534 | 1.0179 | 85 |
	| 1.0512 | 1.0165 | 86 |
	| 1.0495 | 1.0170 | 87 |
	| 1.0478 | 1.0124 | 88 |
	| 1.0458 | 1.0134 | 89 |
	| 1.0439 | 1.0104 | 90 |
	| 1.0418 | 1.0092 | 91 |
	| 1.0401 | 1.0057 | 92 |
	| 1.0377 | 1.0035 | 93 |
	| 1.0370 | 1.0037 | 94 |
	| 1.0345 | 1.0029 | 95 |
	| 1.0339 | 1.0014 | 96 |
	| 1.0322 | 1.0016 | 97 |
	| 1.0296 | 0.9986 | 98 |
	| 1.0286 | 0.9952 | 99 |
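A table like the one above is easier to inspect once it is parsed. The sketch below reads Markdown table rows and reports the epoch with the lowest validation loss and any epochs where validation loss rose from the previous one. It runs on the last five rows only, copied from the table, to stay short. The names `parse_rows` and `regressions` are illustrative.

```python
# Last five rows of the training results table, copied from the card.
EXCERPT = """
| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 1.0345 | 1.0029 | 95 |
| 1.0339 | 1.0014 | 96 |
| 1.0322 | 1.0016 | 97 |
| 1.0296 | 0.9986 | 98 |
| 1.0286 | 0.9952 | 99 |
"""


def parse_rows(table: str) -> list:
    """Return (epoch, train_loss, validation_loss) tuples from a Markdown table."""
    rows = []
    for line in table.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        try:
            train, val, epoch = float(cells[0]), float(cells[1]), int(cells[2])
        except (ValueError, IndexError):
            continue  # skip the header and alignment rows
        rows.append((epoch, train, val))
    return rows


rows = parse_rows(EXCERPT)

# Epoch with the lowest validation loss in the excerpt.
best = min(rows, key=lambda row: row[2])

# Epochs whose validation loss is higher than the previous epoch's.
regressions = [
    current[0]
    for previous, current in zip(rows, rows[1:])
    if current[2] > previous[2]
]

print("best epoch:", best[0], "validation loss:", best[2])
print("epochs where validation loss rose:", regressions)
```

Passing the full table text to `parse_rows` gives the same analysis over all 100 epochs.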


	### Framework versions

	- Transformers 4.20.1
	- TensorFlow 2.8.2
	- Datasets 2.3.2
	- Tokenizers 0.12.1
