Henry Scheible

rollback model to probed version

b57f99d over 1 year ago

	---
	license: mit
	tags:
	- generated_from_trainer
	datasets:
	- crows_pairs
	metrics:
	- accuracy
	model-index:
	- name: xlnet-base-cased_crows_pairs_finetuned
	  results:
	  - task:
	      name: Text Classification
	      type: text-classification
	    dataset:
	      name: crows_pairs
	      type: crows_pairs
	      config: crows_pairs
	      split: test
	      args: crows_pairs
	    metrics:
	    - name: Accuracy
	      type: accuracy
	      value: 0.7119205298013245
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# xlnet-base-cased_crows_pairs_finetuned

	This model is a fine-tuned version of [xlnet-base-cased](https://huggingface.co/xlnet-base-cased) on the crows_pairs dataset.
	It achieves the following results on the evaluation set at the end of training (step 570, epoch 30):
	- Loss: 2.5652
	- Accuracy: 0.7119

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 30
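As a rough illustration of what `lr_scheduler_type: linear` implies for these settings, the sketch below computes the learning rate after a given number of optimizer steps. It assumes no warmup steps (the card lists none) and takes the total of 570 steps from the final row of the training results table. `linear_lr` is a hypothetical helper written for this example, not a Transformers function.

```python
def linear_lr(step: int, base_lr: float = 5e-5,
              total_steps: int = 570, warmup_steps: int = 0) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 570 logged steps over 30 epochs gives 19 optimizer steps per epoch.
steps_per_epoch = 570 // 30

# Assumption: one device, no gradient accumulation, last partial batch kept.
# Under that assumption, 19 batches of 64 bound the training-set size.
min_train_examples = (steps_per_epoch - 1) * 64 + 1
max_train_examples = steps_per_epoch * 64

print(linear_lr(0), linear_lr(285), linear_lr(570))
print(steps_per_epoch, min_train_examples, max_train_examples)
```

Under these assumptions the rate starts at the configured 5e-05, is roughly halved midway through training, and reaches zero at step 570.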

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|
	| 0.728 | 0.53 | 10 | 0.6939 | 0.4901 |
	| 0.6914 | 1.05 | 20 | 0.6939 | 0.4901 |
	| 0.705 | 1.58 | 30 | 0.6925 | 0.5066 |
	| 0.6993 | 2.11 | 40 | 0.6949 | 0.5066 |
	| 0.6979 | 2.63 | 50 | 0.6996 | 0.5066 |
	| 0.7152 | 3.16 | 60 | 0.6940 | 0.4901 |
	| 0.7158 | 3.68 | 70 | 0.7007 | 0.4934 |
	| 0.6968 | 4.21 | 80 | 0.6999 | 0.5066 |
	| 0.7164 | 4.74 | 90 | 0.6977 | 0.4934 |
	| 0.6698 | 5.26 | 100 | 0.7079 | 0.4536 |
	| 0.611 | 5.79 | 110 | 0.8882 | 0.5099 |
	| 0.6487 | 6.32 | 120 | 0.8360 | 0.5066 |
	| 0.5223 | 6.84 | 130 | 0.8047 | 0.5728 |
	| 0.2879 | 7.37 | 140 | 1.1483 | 0.5795 |
	| 0.2369 | 7.89 | 150 | 1.1773 | 0.5993 |
	| 0.2542 | 8.42 | 160 | 0.9170 | 0.6424 |
	| 0.1743 | 8.95 | 170 | 1.3674 | 0.6424 |
	| 0.1307 | 9.47 | 180 | 1.0740 | 0.7152 |
	| 0.0718 | 10.0 | 190 | 1.4397 | 0.6424 |
	| 0.0278 | 10.53 | 200 | 1.9821 | 0.6523 |
	| 0.0519 | 11.05 | 210 | 1.6970 | 0.6755 |
	| 0.0269 | 11.58 | 220 | 1.8299 | 0.6656 |
	| 0.0556 | 12.11 | 230 | 1.9459 | 0.7086 |
	| 0.0455 | 12.63 | 240 | 1.6443 | 0.6854 |
	| 0.0665 | 13.16 | 250 | 1.9887 | 0.6821 |
	| 0.009 | 13.68 | 260 | 2.0236 | 0.6788 |
	| 0.0146 | 14.21 | 270 | 1.8515 | 0.7152 |
	| 0.0034 | 14.74 | 280 | 1.9315 | 0.7252 |
	| 0.0248 | 15.26 | 290 | 2.0754 | 0.7119 |
	| 0.0536 | 15.79 | 300 | 2.0371 | 0.7053 |
	| 0.0393 | 16.32 | 310 | 1.9381 | 0.6987 |
	| 0.0255 | 16.84 | 320 | 1.9074 | 0.6788 |
	| 0.0116 | 17.37 | 330 | 2.2182 | 0.6623 |
	| 0.0128 | 17.89 | 340 | 2.3002 | 0.6689 |
	| 0.0006 | 18.42 | 350 | 2.2353 | 0.6788 |
	| 0.0053 | 18.95 | 360 | 2.4277 | 0.6755 |
	| 0.0013 | 19.47 | 370 | 2.5156 | 0.6490 |
	| 0.0004 | 20.0 | 380 | 2.5091 | 0.6689 |
	| 0.0003 | 20.53 | 390 | 2.4096 | 0.6854 |
	| 0.0017 | 21.05 | 400 | 2.3497 | 0.6921 |
	| 0.0001 | 21.58 | 410 | 2.3376 | 0.6854 |
	| 0.012 | 22.11 | 420 | 2.3832 | 0.6854 |
	| 0.0002 | 22.63 | 430 | 2.4388 | 0.7053 |
	| 0.0001 | 23.16 | 440 | 2.4821 | 0.7152 |
	| 0.0001 | 23.68 | 450 | 2.5027 | 0.7119 |
	| 0.0001 | 24.21 | 460 | 2.5105 | 0.7152 |
	| 0.0001 | 24.74 | 470 | 2.5145 | 0.7152 |
	| 0.0002 | 25.26 | 480 | 2.5143 | 0.6954 |
	| 0.0001 | 25.79 | 490 | 2.5629 | 0.6821 |
	| 0.0002 | 26.32 | 500 | 2.5414 | 0.6887 |
	| 0.0001 | 26.84 | 510 | 2.5301 | 0.7119 |
	| 0.0012 | 27.37 | 520 | 2.5360 | 0.7020 |
	| 0.0 | 27.89 | 530 | 2.5428 | 0.6921 |
	| 0.0117 | 28.42 | 540 | 2.5455 | 0.6954 |
	| 0.0001 | 28.95 | 550 | 2.5598 | 0.7086 |
	| 0.0001 | 29.47 | 560 | 2.5648 | 0.7119 |
	| 0.0001 | 30.0 | 570 | 2.5652 | 0.7119 |
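In the log above, training loss falls to about 0.0001 while validation loss climbs from about 0.69 to 2.57, a pattern that usually indicates overfitting. The final step is also not the row with the highest accuracy. The sketch below shows one way to pick a row by metric. It uses only a handful of rows copied from the table; the variable names are illustrative, and the card does not say whether intermediate checkpoints were saved.

```python
# (step, validation_loss, accuracy) for a few rows copied from the table above.
rows = [
    (30, 0.6925, 0.5066),
    (130, 0.8047, 0.5728),
    (180, 1.0740, 0.7152),
    (280, 1.9315, 0.7252),
    (570, 2.5652, 0.7119),
]

# Row with the highest evaluation accuracy.
best_by_accuracy = max(rows, key=lambda r: r[2])

# Row with the lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[1])

print("best accuracy:", best_by_accuracy)
print("lowest validation loss:", best_by_loss)
```

The two criteria disagree here: the lowest validation loss occurs early, at near-chance accuracy, while the highest accuracy occurs at step 280. Which metric should drive checkpoint selection therefore needs to be decided up front.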


	### Framework versions

	- Transformers 4.26.1
	- PyTorch 1.13.1
	- Datasets 2.10.1
	- Tokenizers 0.13.2
