2_5e-3_10_0.5 / README.md

Onutoa

update model card README.md

87337e0 about 1 year ago


	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	datasets:
	- super_glue
	metrics:
	- accuracy
	model-index:
	- name: 2_5e-3_10_0.5
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# 2_5e-3_10_0.5

	This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the super_glue dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.8743
	- Accuracy: 0.7407

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.005
	- train_batch_size: 16
	- eval_batch_size: 8
	- seed: 11
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 60.0
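
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step. It is a plain-Python illustration, not the Trainer's own code. It assumes zero warmup steps (the card does not state a warmup value) and takes the total of 35400 steps from the last row of the training results table; the function name `linear_lr` is hypothetical.

```python
def linear_lr(step: int,
              base_lr: float = 5e-3,      # learning_rate from the card
              total_steps: int = 35400,   # 60 epochs x 590 steps, per the results table
              warmup_steps: int = 0) -> float:  # ASSUMPTION: warmup is not stated in the card
    """Learning rate after `step` updates under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the start, midpoint, and end of training.
    for s in (0, 17700, 35400):
        print(s, linear_lr(s))
```

Under these assumptions the rate starts at the full 0.005 and falls to half that value by the end of epoch 30.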

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
	|:-------------:|:-----:|:-----:|:---------------:|:--------:|
	| 2.0951 | 1.0 | 590 | 2.8478 | 0.6208 |
	| 2.0966 | 2.0 | 1180 | 2.0402 | 0.6208 |
	| 1.9864 | 3.0 | 1770 | 2.9563 | 0.4196 |
	| 1.9962 | 4.0 | 2360 | 2.4148 | 0.4905 |
	| 1.8743 | 5.0 | 2950 | 2.1057 | 0.6217 |
	| 1.562 | 6.0 | 3540 | 1.6253 | 0.6636 |
	| 1.4913 | 7.0 | 4130 | 1.4832 | 0.6734 |
	| 1.4114 | 8.0 | 4720 | 1.4386 | 0.6560 |
	| 1.3732 | 9.0 | 5310 | 1.4139 | 0.6508 |
	| 1.3161 | 10.0 | 5900 | 1.3009 | 0.6893 |
	| 1.2979 | 11.0 | 6490 | 1.2760 | 0.6963 |
	| 1.1837 | 12.0 | 7080 | 1.2606 | 0.6737 |
	| 1.2171 | 13.0 | 7670 | 1.2241 | 0.7040 |
	| 1.1545 | 14.0 | 8260 | 1.2533 | 0.7086 |
	| 1.1424 | 15.0 | 8850 | 1.1613 | 0.7061 |
	| 1.1106 | 16.0 | 9440 | 1.1290 | 0.7018 |
	| 1.0798 | 17.0 | 10030 | 1.1366 | 0.7049 |
	| 1.0665 | 18.0 | 10620 | 1.1030 | 0.7147 |
	| 1.0642 | 19.0 | 11210 | 1.1100 | 0.7168 |
	| 1.0498 | 20.0 | 11800 | 1.1124 | 0.7235 |
	| 0.9966 | 21.0 | 12390 | 1.1192 | 0.7211 |
	| 1.0178 | 22.0 | 12980 | 1.0786 | 0.7211 |
	| 0.9956 | 23.0 | 13570 | 1.0710 | 0.7024 |
	| 0.9896 | 24.0 | 14160 | 1.0254 | 0.7211 |
	| 0.9496 | 25.0 | 14750 | 1.0181 | 0.7217 |
	| 0.9755 | 26.0 | 15340 | 1.0013 | 0.7211 |
	| 0.9439 | 27.0 | 15930 | 1.0014 | 0.7153 |
	| 0.9151 | 28.0 | 16520 | 0.9923 | 0.7336 |
	| 0.8988 | 29.0 | 17110 | 0.9776 | 0.7318 |
	| 0.8962 | 30.0 | 17700 | 0.9625 | 0.7401 |
	| 0.8825 | 31.0 | 18290 | 0.9702 | 0.7346 |
	| 0.8734 | 32.0 | 18880 | 0.9766 | 0.7394 |
	| 0.8651 | 33.0 | 19470 | 0.9443 | 0.7394 |
	| 0.8404 | 34.0 | 20060 | 0.9665 | 0.7364 |
	| 0.8312 | 35.0 | 20650 | 0.9290 | 0.7370 |
	| 0.8401 | 36.0 | 21240 | 0.9546 | 0.7309 |
	| 0.8121 | 37.0 | 21830 | 0.9287 | 0.7391 |
	| 0.8162 | 38.0 | 22420 | 0.9171 | 0.7278 |
	| 0.8096 | 39.0 | 23010 | 0.9196 | 0.7428 |
	| 0.7901 | 40.0 | 23600 | 0.9168 | 0.7422 |
	| 0.8011 | 41.0 | 24190 | 0.9136 | 0.7297 |
	| 0.7908 | 42.0 | 24780 | 0.9080 | 0.7385 |
	| 0.7755 | 43.0 | 25370 | 0.9270 | 0.7446 |
	| 0.786 | 44.0 | 25960 | 0.8954 | 0.7333 |
	| 0.7664 | 45.0 | 26550 | 0.9038 | 0.7410 |
	| 0.7725 | 46.0 | 27140 | 0.8874 | 0.7431 |
	| 0.7607 | 47.0 | 27730 | 0.9019 | 0.7416 |
	| 0.7683 | 48.0 | 28320 | 0.9069 | 0.7456 |
	| 0.7594 | 49.0 | 28910 | 0.9003 | 0.7318 |
	| 0.7317 | 50.0 | 29500 | 0.8860 | 0.7428 |
	| 0.7306 | 51.0 | 30090 | 0.8862 | 0.7434 |
	| 0.736 | 52.0 | 30680 | 0.8952 | 0.7471 |
	| 0.7343 | 53.0 | 31270 | 0.8761 | 0.7419 |
	| 0.7248 | 54.0 | 31860 | 0.8876 | 0.7309 |
	| 0.7334 | 55.0 | 32450 | 0.8841 | 0.7431 |
	| 0.7458 | 56.0 | 33040 | 0.8817 | 0.7434 |
	| 0.727 | 57.0 | 33630 | 0.8743 | 0.7431 |
	| 0.7077 | 58.0 | 34220 | 0.8741 | 0.7422 |
	| 0.7158 | 59.0 | 34810 | 0.8768 | 0.7446 |
	| 0.7061 | 60.0 | 35400 | 0.8743 | 0.7407 |
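
The evaluation results reported at the top of the card are those of the final epoch, and a few earlier epochs score slightly better on one metric or the other. The sketch below shows one way to read that off the table. It parses a handful of rows copied from the table (epochs 48, 52, 57, 58 and 60, not the full log) and picks the epoch with the highest accuracy and the one with the lowest validation loss. The helper name `parse_rows` and the dictionary keys are illustrative choices, not part of any library.

```python
# A subset of rows copied verbatim from the training results table.
ROWS = """
| 0.7683 | 48.0 | 28320 | 0.9069 | 0.7456 |
| 0.736 | 52.0 | 30680 | 0.8952 | 0.7471 |
| 0.727 | 57.0 | 33630 | 0.8743 | 0.7431 |
| 0.7077 | 58.0 | 34220 | 0.8741 | 0.7422 |
| 0.7061 | 60.0 | 35400 | 0.8743 | 0.7407 |
"""


def parse_rows(text: str) -> list[dict]:
    """Turn markdown table rows into dictionaries with numeric fields."""
    rows = []
    for line in text.strip().splitlines():
        # Drop the outer pipes, then split the five cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss, acc = cells
        rows.append({
            "train_loss": float(train_loss),
            "epoch": int(float(epoch)),
            "step": int(step),
            "val_loss": float(val_loss),
            "accuracy": float(acc),
        })
    return rows


rows = parse_rows(ROWS)
best_acc = max(rows, key=lambda r: r["accuracy"])    # highest accuracy among these rows
best_loss = min(rows, key=lambda r: r["val_loss"])   # lowest validation loss among these rows
steps_per_epoch = rows[-1]["step"] // rows[-1]["epoch"]

print("best accuracy:", best_acc["epoch"], best_acc["accuracy"])
print("lowest validation loss:", best_loss["epoch"], best_loss["val_loss"])
print("steps per epoch:", steps_per_epoch)
```

The step count per epoch also gives a rough bound on the training set size: with a train batch size of 16, and assuming a single device with no gradient accumulation (neither is stated in the card), 590 steps per epoch corresponds to between 9425 and 9440 training examples.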


	### Framework versions

	- Transformers 4.30.0
	- Pytorch 2.0.1+cu117
	- Datasets 2.14.4
	- Tokenizers 0.13.3
