Dan Semin

	---
	license: apache-2.0
	tags:
	- text-classification
	- generated_from_trainer
	datasets:
	- xnli
	metrics:
	- accuracy
	model-index:
	- name: xnli_m_bert_only_en_single_gpu
	  results:
	  - task:
	      name: Text Classification
	      type: text-classification
	    dataset:
	      name: xnli
	      type: xnli
	      config: en
	      split: train
	      args: en
	    metrics:
	    - name: Accuracy
	      type: accuracy
	      value: 0.810843373493976
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# xnli_m_bert_only_en_single_gpu

	This model is a fine-tuned version of [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) on the English (`en`) configuration of the xnli dataset.
	At the final logged evaluation (step 73000), it achieves the following results on the evaluation set:
	- Loss: 0.5306
	- Accuracy: 0.8108
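
The card does not say how many examples the evaluation set contains. As a rough plausibility check, the sketch below assumes it is the 2,490-example XNLI English validation split (an assumption, not something stated here; the metadata block lists `split: train`) and tests whether the full-precision accuracy from the metadata corresponds to a whole number of correct predictions.

```python
# Full-precision accuracy copied from the model-index metadata.
REPORTED_ACCURACY = 0.810843373493976

# Assumption: the evaluation set is the XNLI English validation split,
# taken here to contain 2,490 examples. The card does not confirm this.
ASSUMED_EVAL_SIZE = 2490

# Nearest whole number of correct predictions for that set size.
correct = round(REPORTED_ACCURACY * ASSUMED_EVAL_SIZE)

# Accuracy that this count would produce.
reconstructed = correct / ASSUMED_EVAL_SIZE

# True when the reported value matches a whole count to within rounding.
is_consistent = abs(reconstructed - REPORTED_ACCURACY) < 1e-12

print(f"{correct} / {ASSUMED_EVAL_SIZE} = {reconstructed}")
print("consistent with assumed size:", is_consistent)
```

A match only shows that the reported number is compatible with that set size; it does not prove which split was used.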

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 3.0
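
To make the `linear` scheduler setting concrete, here is a minimal sketch of how the learning rate could decay over the run. It rests on several assumptions that the card does not state: no warmup steps, no gradient accumulation, a single device (so 16 is the effective batch size), and a training split of 392,702 examples. The last figure is only chosen because it agrees with the Epoch and Step columns of the training log (step 49000 falls at roughly epoch 2.0).

```python
import math

# Values taken from the hyperparameter list above.
BASE_LR = 5e-05
TRAIN_BATCH_SIZE = 16
NUM_EPOCHS = 3

# Assumption: number of training examples (not stated in the card).
ASSUMED_TRAIN_EXAMPLES = 392_702

# Optimizer steps per epoch and in total, assuming no gradient accumulation.
steps_per_epoch = math.ceil(ASSUMED_TRAIN_EXAMPLES / TRAIN_BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total=total_steps, warmup=0):
    """Learning rate at `step` for linear warmup followed by linear decay to 0."""
    if step < warmup:
        # Ramp up from 0 to base_lr over the warmup phase (unused when warmup=0).
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr to 0 over the remaining steps.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    for step in (0, 25000, 50000, total_steps):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 5e-05, reaches half that value midway through the run, and hits zero at the final step.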

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy |
	|:-------------:|:-----:|:-----:|:---------------:|:--------:|
	| 0.8884 | 0.04 | 1000 | 0.7743 | 0.6703 |
	| 0.782 | 0.08 | 2000 | 0.7029 | 0.7060 |
	| 0.7479 | 0.12 | 3000 | 0.7366 | 0.6880 |
	| 0.7348 | 0.16 | 4000 | 0.6722 | 0.7285 |
	| 0.721 | 0.2 | 5000 | 0.6802 | 0.7237 |
	| 0.7097 | 0.24 | 6000 | 0.6801 | 0.7217 |
	| 0.6978 | 0.29 | 7000 | 0.6051 | 0.7643 |
	| 0.6924 | 0.33 | 8000 | 0.6793 | 0.7357 |
	| 0.6807 | 0.37 | 9000 | 0.6604 | 0.7502 |
	| 0.6636 | 0.41 | 10000 | 0.6309 | 0.7430 |
	| 0.6616 | 0.45 | 11000 | 0.6039 | 0.7490 |
	| 0.6561 | 0.49 | 12000 | 0.6051 | 0.7610 |
	| 0.6545 | 0.53 | 13000 | 0.6354 | 0.7454 |
	| 0.644 | 0.57 | 14000 | 0.6064 | 0.7466 |
	| 0.6446 | 0.61 | 15000 | 0.6052 | 0.7554 |
	| 0.6414 | 0.65 | 16000 | 0.6365 | 0.7422 |
	| 0.6311 | 0.69 | 17000 | 0.6118 | 0.7546 |
	| 0.6187 | 0.73 | 18000 | 0.5973 | 0.7538 |
	| 0.619 | 0.77 | 19000 | 0.5863 | 0.7570 |
	| 0.6108 | 0.81 | 20000 | 0.6212 | 0.7490 |
	| 0.6136 | 0.86 | 21000 | 0.5810 | 0.7695 |
	| 0.6018 | 0.9 | 22000 | 0.5799 | 0.7731 |
	| 0.6198 | 0.94 | 23000 | 0.5548 | 0.7723 |
	| 0.6047 | 0.98 | 24000 | 0.5964 | 0.7622 |
	| 0.5636 | 1.02 | 25000 | 0.5805 | 0.7851 |
	| 0.5267 | 1.06 | 26000 | 0.5540 | 0.7795 |
	| 0.5067 | 1.1 | 27000 | 0.5388 | 0.7855 |
	| 0.5304 | 1.14 | 28000 | 0.5482 | 0.7799 |
	| 0.5332 | 1.18 | 29000 | 0.5290 | 0.7859 |
	| 0.5154 | 1.22 | 30000 | 0.5475 | 0.7799 |
	| 0.524 | 1.26 | 31000 | 0.5305 | 0.7900 |
	| 0.5236 | 1.3 | 32000 | 0.5691 | 0.7871 |
	| 0.5154 | 1.34 | 33000 | 0.5642 | 0.7739 |
	| 0.5248 | 1.39 | 34000 | 0.5590 | 0.7643 |
	| 0.5077 | 1.43 | 35000 | 0.6064 | 0.7715 |
	| 0.5147 | 1.47 | 36000 | 0.5343 | 0.7948 |
	| 0.5041 | 1.51 | 37000 | 0.5375 | 0.7867 |
	| 0.5054 | 1.55 | 38000 | 0.5660 | 0.7727 |
	| 0.5053 | 1.59 | 39000 | 0.5479 | 0.7859 |
	| 0.5009 | 1.63 | 40000 | 0.5080 | 0.7960 |
	| 0.5081 | 1.67 | 41000 | 0.5139 | 0.7920 |
	| 0.5013 | 1.71 | 42000 | 0.5385 | 0.7904 |
	| 0.4972 | 1.75 | 43000 | 0.5257 | 0.7928 |
	| 0.4987 | 1.79 | 44000 | 0.5056 | 0.8020 |
	| 0.4863 | 1.83 | 45000 | 0.5030 | 0.8004 |
	| 0.5 | 1.87 | 46000 | 0.5157 | 0.7980 |
	| 0.4926 | 1.91 | 47000 | 0.5505 | 0.7924 |
	| 0.4893 | 1.96 | 48000 | 0.5286 | 0.8004 |
	| 0.4755 | 2.0 | 49000 | 0.5216 | 0.8036 |
	| 0.3855 | 2.04 | 50000 | 0.6087 | 0.7884 |
	| 0.3731 | 2.08 | 51000 | 0.5485 | 0.8064 |
	| 0.3698 | 2.12 | 52000 | 0.5398 | 0.8080 |
	| 0.3702 | 2.16 | 53000 | 0.5454 | 0.8 |
	| 0.3688 | 2.2 | 54000 | 0.5512 | 0.8068 |
	| 0.3683 | 2.24 | 55000 | 0.5423 | 0.8060 |
	| 0.3704 | 2.28 | 56000 | 0.5383 | 0.8084 |
	| 0.3758 | 2.32 | 57000 | 0.5398 | 0.8161 |
	| 0.3781 | 2.36 | 58000 | 0.5481 | 0.8088 |
	| 0.3697 | 2.4 | 59000 | 0.5465 | 0.8056 |
	| 0.3706 | 2.44 | 60000 | 0.5488 | 0.7988 |
	| 0.3704 | 2.49 | 61000 | 0.5395 | 0.8052 |
	| 0.3648 | 2.53 | 62000 | 0.5463 | 0.8068 |
	| 0.36 | 2.57 | 63000 | 0.5400 | 0.8052 |
	| 0.3661 | 2.61 | 64000 | 0.5542 | 0.8068 |
	| 0.3555 | 2.65 | 65000 | 0.5424 | 0.8044 |
	| 0.3551 | 2.69 | 66000 | 0.5269 | 0.8124 |
	| 0.3608 | 2.73 | 67000 | 0.5382 | 0.8129 |
	| 0.35 | 2.77 | 68000 | 0.5461 | 0.8108 |
	| 0.3457 | 2.81 | 69000 | 0.5477 | 0.8084 |
	| 0.3516 | 2.85 | 70000 | 0.5345 | 0.8104 |
	| 0.3499 | 2.89 | 71000 | 0.5344 | 0.8129 |
	| 0.3513 | 2.93 | 72000 | 0.5279 | 0.8120 |
	| 0.3442 | 2.97 | 73000 | 0.5306 | 0.8108 |
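
The headline loss and accuracy match the last row of this log (step 73000), which is not the best row on either metric: the highest accuracy in the table is 0.8161 at step 57000, and the lowest validation loss is 0.5030 at step 45000. The sketch below shows one way to pick a checkpoint from such a log. It uses a hand-copied excerpt of six rows, and the helper names are illustrative rather than part of any library.

```python
# Excerpt of (step, validation_loss, accuracy) rows copied from the log above.
# Only a few rows are included, so the results apply to this excerpt.
LOG_EXCERPT = [
    (45000, 0.5030, 0.8004),
    (49000, 0.5216, 0.8036),
    (57000, 0.5398, 0.8161),
    (66000, 0.5269, 0.8124),
    (71000, 0.5344, 0.8129),
    (73000, 0.5306, 0.8108),
]


def best_by_accuracy(rows):
    """Return the row with the highest accuracy."""
    return max(rows, key=lambda row: row[2])


def best_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


if __name__ == "__main__":
    print("best accuracy:", best_by_accuracy(LOG_EXCERPT))
    print("lowest validation loss:", best_by_loss(LOG_EXCERPT))
    print("final row:", LOG_EXCERPT[-1])
```

The two criteria select different steps here, so which checkpoint counts as "best" depends on the metric chosen.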


	### Framework versions

	- Transformers 4.24.0
	- PyTorch 1.13.0
	- Datasets 2.6.1
	- Tokenizers 0.13.1
