bert-tiny-mlm-finetuned-imdb / README.md

muhtasham

update model card README.md

986c9c9 over 1 year ago

	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	model-index:
	- name: bert-tiny-mlm-finetuned-imdb
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# bert-tiny-mlm-finetuned-imdb

	This model is a fine-tuned version of [google/bert_uncased_L-2_H-128_A-2](https://huggingface.co/google/bert_uncased_L-2_H-128_A-2). The Trainer recorded the dataset as `None`; the model name suggests IMDb, but the card does not confirm it.
	It achieves the following results on the evaluation set:
	- Loss: 3.4487
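A masked-language-modeling loss is often easier to read as a perplexity. The sketch below converts the reported loss, assuming (the card does not say) that it is a mean per-token cross-entropy measured in nats, which is the usual convention for a PyTorch cross-entropy loss.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 3.4487

# Assumption: the loss is a mean per-token cross-entropy in nats,
# so the corresponding perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"eval loss {eval_loss} -> perplexity {perplexity:.2f}")
```

Under that assumption the perplexity on masked tokens comes out at roughly 31.5.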

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
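Until this section is filled in, here is a minimal sketch of the one use the model name implies, masked-token prediction. The repository ID `muhtasham/bert-tiny-mlm-finetuned-imdb` is an assumption pieced together from the author and model names, and `top_predictions` is a hypothetical helper, not something the card defines. It needs the `transformers` package and network access to download the weights.

```python
# Assumed repository ID (author name + model name); not stated in the card.
MODEL_ID = "muhtasham/bert-tiny-mlm-finetuned-imdb"


def top_predictions(text, model_id=MODEL_ID, k=5):
    """Return the k most likely fillers for the single [MASK] token in `text`."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_id)
    results = fill_mask(text, top_k=k)
    # Each result is a dict with (among others) "token_str" and "score".
    return [(r["token_str"], r["score"]) for r in results]
```

A call such as `top_predictions("This movie was absolutely [MASK].")` would return a list of `(token, score)` pairs. With only two layers and a hidden size of 128 in the base model, predictions are likely to be coarse.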

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 3e-05
	- train_batch_size: 128
	- eval_batch_size: 128
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: constant
	- num_epochs: 200
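To make the optimizer settings concrete, the sketch below applies one Adam update to a single scalar parameter using the values listed above. It is an illustration of the textbook Adam rule, not the Trainer's code: it ignores weight decay (the card lists none) and any gradient clipping, and `adam_step` is a hypothetical helper.

```python
import math

# Values taken from the hyperparameter list above.
LR, BETA1, BETA2, EPS = 3e-05, 0.9, 0.999, 1e-08


def adam_step(param, grad, m, v, t):
    """One textbook Adam update for a scalar parameter at step t (1-based)."""
    m = BETA1 * m + (1 - BETA1) * grad          # first-moment estimate
    v = BETA2 * v + (1 - BETA2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - BETA1 ** t)                # bias correction
    v_hat = v / (1 - BETA2 ** t)
    # lr_scheduler_type "constant": the learning rate is never decayed.
    param -= LR * m_hat / (math.sqrt(v_hat) + EPS)
    return param, m, v


# First step from param=1.0 with an illustrative gradient of 0.5.
param, m, v = adam_step(1.0, 0.5, 0.0, 0.0, t=1)
print(param)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by almost exactly the learning rate (3e-05) regardless of the gradient's scale.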

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss |
	|:-------------:|:-----:|:-----:|:---------------:|
	| 4.1774 | 1.04 | 500 | 3.7705 |
	| 4.041 | 2.09 | 1000 | 3.7196 |
	| 3.9982 | 3.13 | 1500 | 3.6826 |
	| 3.9614 | 4.18 | 2000 | 3.6543 |
	| 3.9274 | 5.22 | 2500 | 3.6438 |
	| 3.9089 | 6.26 | 3000 | 3.6294 |
	| 3.8929 | 7.31 | 3500 | 3.6217 |
	| 3.873 | 8.35 | 4000 | 3.6083 |
	| 3.8659 | 9.39 | 4500 | 3.5900 |
	| 3.8484 | 10.44 | 5000 | 3.5791 |
	| 3.8261 | 11.48 | 5500 | 3.5731 |
	| 3.8228 | 12.53 | 6000 | 3.5579 |
	| 3.8098 | 13.57 | 6500 | 3.5576 |
	| 3.8028 | 14.61 | 7000 | 3.5532 |
	| 3.7881 | 15.66 | 7500 | 3.5440 |
	| 3.7829 | 16.7 | 8000 | 3.5440 |
	| 3.7727 | 17.75 | 8500 | 3.5372 |
	| 3.7648 | 18.79 | 9000 | 3.5248 |
	| 3.7504 | 19.83 | 9500 | 3.5223 |
	| 3.7487 | 20.88 | 10000 | 3.5212 |
	| 3.7497 | 21.92 | 10500 | 3.5166 |
	| 3.7344 | 22.96 | 11000 | 3.5103 |
	| 3.7339 | 24.01 | 11500 | 3.5052 |
	| 3.722 | 25.05 | 12000 | 3.5067 |
	| 3.7188 | 26.1 | 12500 | 3.4941 |
	| 3.7127 | 27.14 | 13000 | 3.4951 |
	| 3.7113 | 28.18 | 13500 | 3.4904 |
	| 3.7042 | 29.23 | 14000 | 3.4813 |
	| 3.7011 | 30.27 | 14500 | 3.4805 |
	| 3.6936 | 31.32 | 15000 | 3.4886 |
	| 3.6889 | 32.36 | 15500 | 3.4825 |
	| 3.6771 | 33.4 | 16000 | 3.4785 |
	| 3.6753 | 34.45 | 16500 | 3.4819 |
	| 3.6743 | 35.49 | 17000 | 3.4744 |
	| 3.6686 | 36.53 | 17500 | 3.4658 |
	| 3.669 | 37.58 | 18000 | 3.4607 |
	| 3.6623 | 38.62 | 18500 | 3.4688 |
	| 3.6648 | 39.67 | 19000 | 3.4676 |
	| 3.6574 | 40.71 | 19500 | 3.4581 |
	| 3.652 | 41.75 | 20000 | 3.4601 |
	| 3.6506 | 42.8 | 20500 | 3.4630 |
	| 3.6466 | 43.84 | 21000 | 3.4530 |
	| 3.637 | 44.89 | 21500 | 3.4507 |
	| 3.6428 | 45.93 | 22000 | 3.4557 |
	| 3.6408 | 46.97 | 22500 | 3.4483 |
	| 3.6368 | 48.02 | 23000 | 3.4505 |
	| 3.6322 | 49.06 | 23500 | 3.4494 |
	| 3.6256 | 50.1 | 24000 | 3.4487 |
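A few quantities the card does not state can be estimated from the table above. The sketch below derives the approximate number of optimizer steps per epoch, the implied number of training sequences, and the checkpoint with the lowest validation loss. It assumes a single device with no gradient accumulation and treats the rounded epoch column as exact, so the results are rough estimates only.

```python
# (epoch, step, validation_loss) for a few rows copied from the table above.
rows = [
    (1.04, 500, 3.7705),
    (44.89, 21500, 3.4507),
    (46.97, 22500, 3.4483),
    (48.02, 23000, 3.4505),
    (49.06, 23500, 3.4494),
    (50.1, 24000, 3.4487),
]
TRAIN_BATCH_SIZE = 128  # from the hyperparameter list

# Steps per epoch, estimated from the last logged row.
last_epoch, last_step, _ = rows[-1]
steps_per_epoch = last_step / last_epoch

# Assumption: one device, no gradient accumulation.
approx_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Row with the lowest validation loss among those listed here.
best_epoch, best_step, best_loss = min(rows, key=lambda r: r[2])

print(f"~{steps_per_epoch:.0f} steps/epoch, ~{approx_examples:.0f} training sequences")
print(f"lowest validation loss {best_loss} at step {best_step}")
```

This gives about 479 steps per epoch, or roughly 61,000 training sequences. The lowest validation loss in the table is 3.4483 at step 22500, marginally below the final 3.4487. The log also ends near epoch 50 although `num_epochs` is 200; the card does not explain why.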


	### Framework versions

	- Transformers 4.25.1
	- PyTorch 1.12.1+cu113
	- Datasets 2.7.1
	- Tokenizers 0.13.2
