	---
	tags:
	- generated_from_trainer
	model-index:
	- name: kaz_legal_bert_5
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# kaz_legal_bert_5

	This model is a fine-tuned version of an unspecified base model on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 4.8262
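
The card does not state the training objective. If the reported loss is a mean per-token cross-entropy in nats (an assumption, though a common one for language-model fine-tuning with the Trainer), it can be converted to a perplexity, which is often easier to interpret. The sketch below shows that conversion; the function name `perplexity` is illustrative only.

```python
import math

# Final validation loss reported in this card.
EVAL_LOSS = 4.8262


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean cross-entropy loss (in nats) to perplexity.

    Assumes the loss is an average per-token negative log-likelihood,
    which the card does not confirm.
    """
    return math.exp(mean_cross_entropy_nats)


if __name__ == "__main__":
    print(f"perplexity: {perplexity(EVAL_LOSS):.1f}")
```

Under that assumption the final loss corresponds to a perplexity of roughly 125.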

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- gradient_accumulation_steps: 8
	- total_train_batch_size: 32
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 5
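
As a rough illustration of how these values relate, the sketch below derives the total train batch size from the per-device batch size and the gradient accumulation steps, and shows how a linear schedule decays the learning rate. It assumes a single device and no warmup steps, neither of which the card states, and `TOTAL_STEPS` is a hypothetical round number (the results table ends at step 59000). The helper names are illustrative, not part of any library.

```python
# Values taken from the hyperparameter list in this card.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 8

# Assumptions, not stated in the card.
NUM_DEVICES = 1
WARMUP_STEPS = 0
TOTAL_STEPS = 60_000  # hypothetical round number for illustration


def total_train_batch_size(per_device: int, accumulation: int, devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step: int, total_steps: int, base_lr: float = LEARNING_RATE,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    batch = total_train_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES)
    print(f"total_train_batch_size: {batch}")
    for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(f"step {step}: lr = {linear_lr(step, TOTAL_STEPS):.2e}")
```

With one device, 4 examples per step accumulated over 8 steps gives the listed total train batch size of 32.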

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss |
	|:-------------:|:-----:|:-----:|:---------------:|
	| 8.1161        | 0.08  | 1000  | 7.7116          |
	| 7.6258        | 0.17  | 2000  | 7.4606          |
	| 7.4268        | 0.25  | 3000  | 7.3184          |
	| 7.2837        | 0.34  | 4000  | 7.2020          |
	| 7.1969        | 0.42  | 5000  | 7.1236          |
	| 7.1201        | 0.5   | 6000  | 7.0599          |
	| 7.0683        | 0.59  | 7000  | 6.9990          |
	| 6.9956        | 0.67  | 8000  | 6.9369          |
	| 6.9392        | 0.76  | 9000  | 6.8828          |
	| 6.8949        | 0.84  | 10000 | 6.8263          |
	| 6.8437        | 0.92  | 11000 | 6.7913          |
	| 6.8027        | 1.01  | 12000 | 6.7392          |
	| 6.7539        | 1.09  | 13000 | 6.7010          |
	| 6.7316        | 1.18  | 14000 | 6.6663          |
	| 6.6853        | 1.26  | 15000 | 6.6338          |
	| 6.6449        | 1.34  | 16000 | 6.6004          |
	| 6.6188        | 1.43  | 17000 | 6.5463          |
	| 6.5831        | 1.51  | 18000 | 6.5042          |
	| 6.5498        | 1.6   | 19000 | 6.4581          |
	| 6.5116        | 1.68  | 20000 | 6.4205          |
	| 6.4579        | 1.77  | 21000 | 6.3473          |
	| 6.3996        | 1.85  | 22000 | 6.2794          |
	| 6.3358        | 1.93  | 23000 | 6.2082          |
	| 6.2827        | 2.02  | 24000 | 6.1448          |
	| 6.2381        | 2.1   | 25000 | 6.0923          |
	| 6.1947        | 2.19  | 26000 | 6.0460          |
	| 6.1479        | 2.27  | 27000 | 6.0002          |
	| 6.1095        | 2.35  | 28000 | 5.9537          |
	| 6.0669        | 2.44  | 29000 | 5.9139          |
	| 6.0411        | 2.52  | 30000 | 5.8827          |
	| 6.0081        | 2.61  | 31000 | 5.8454          |
	| 5.9939        | 2.69  | 32000 | 5.8276          |
	| 5.9714        | 2.77  | 33000 | 5.8060          |
	| 5.9524        | 2.86  | 34000 | 5.7878          |
	| 5.9357        | 2.94  | 35000 | 5.7772          |
	| 5.9705        | 3.03  | 36000 | 5.7964          |
	| 5.9276        | 3.11  | 37000 | 5.7410          |
	| 5.8802        | 3.19  | 38000 | 5.6813          |
	| 5.8342        | 3.28  | 39000 | 5.6268          |
	| 5.786         | 3.36  | 40000 | 5.5729          |
	| 5.7328        | 3.45  | 41000 | 5.5030          |
	| 5.6604        | 3.53  | 42000 | 5.4495          |
	| 5.6102        | 3.61  | 43000 | 5.3746          |
	| 5.5296        | 3.7   | 44000 | 5.3149          |
	| 5.4876        | 3.78  | 45000 | 5.2536          |
	| 5.417         | 3.87  | 46000 | 5.2004          |
	| 5.3665        | 3.95  | 47000 | 5.1488          |
	| 5.3131        | 4.03  | 48000 | 5.0948          |
	| 5.2697        | 4.12  | 49000 | 5.0538          |
	| 5.2307        | 4.2   | 50000 | 5.0139          |
	| 5.1975        | 4.29  | 51000 | 4.9757          |
	| 5.1506        | 4.37  | 52000 | 4.9439          |
	| 5.1285        | 4.45  | 53000 | 4.9238          |
	| 5.1009        | 4.54  | 54000 | 4.8900          |
	| 5.072         | 4.62  | 55000 | 4.8735          |
	| 5.0533        | 4.71  | 56000 | 4.8548          |
	| 5.0412        | 4.79  | 57000 | 4.8421          |
	| 5.0319        | 4.88  | 58000 | 4.8326          |
	| 5.0119        | 4.96  | 59000 | 4.8262          |


	### Framework versions

	- Transformers 4.28.1
	- PyTorch 2.0.0+cu118
	- Datasets 2.11.0
	- Tokenizers 0.13.3
