gokuls/mobilebert_sa_pre-training-complete


	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	datasets:
	- wikitext
	metrics:
	- accuracy
	model-index:
	- name: mobilebert_sa_pre-training-complete
	  results:
	  - task:
	      name: Masked Language Modeling
	      type: fill-mask
	    dataset:
	      name: wikitext wikitext-103-raw-v1
	      type: wikitext
	      config: wikitext-103-raw-v1
	      split: validation
	      args: wikitext-103-raw-v1
	    metrics:
	    - name: Accuracy
	      type: accuracy
	      value: 0.7161816392520737
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# mobilebert_sa_pre-training-complete

	This model is a fine-tuned version of [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased), trained with a masked language modeling objective on the `wikitext-103-raw-v1` configuration of the wikitext dataset.
	It achieves the following results on the validation split:
	- Loss: 1.3239
	- Accuracy: 0.7162
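
The card does not say how the accuracy figure is computed. A common convention for masked language modeling, and the one assumed in this sketch, is to score only the masked positions and to skip every position whose label equals the ignore index `-100`. The name `masked_accuracy` and the toy token ids are illustrative and are not taken from the training code.

```python
IGNORE_INDEX = -100  # assumed label value for positions that were not masked


def masked_accuracy(predictions, labels):
    """Fraction of masked positions where the predicted token id equals the label.

    `predictions` and `labels` are flat sequences of token ids of equal length.
    Positions whose label is IGNORE_INDEX are excluded from the score.
    """
    scored = [(p, l) for p, l in zip(predictions, labels) if l != IGNORE_INDEX]
    if not scored:
        raise ValueError("no masked positions to score")
    correct = sum(1 for p, l in scored if p == l)
    return correct / len(scored)
```

With four positions of which three are masked and two are predicted correctly, the function returns 2/3; the unmasked position does not affect the result.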

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
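
The metadata tags this checkpoint as `fill-mask`, so one plausible way to try it is the `fill-mask` pipeline from Transformers. The following is a minimal sketch under that assumption: the helper names `load_fill_mask` and `best_completion` are made up for illustration, and the repository id is taken from the page header rather than from any usage instructions by the author.

```python
def load_fill_mask(model_id="gokuls/mobilebert_sa_pre-training-complete"):
    """Build a fill-mask pipeline for the checkpoint (downloads it on first use)."""
    # Imported lazily so that best_completion() can be used without transformers.
    from transformers import pipeline

    return pipeline("fill-mask", model=model_id)


def best_completion(predictions):
    """Return (token_str, score) of the highest-scoring candidate.

    `predictions` is the list of dicts that a fill-mask pipeline returns
    for an input containing a single mask token.
    """
    top = max(predictions, key=lambda p: p["score"])
    return top["token_str"], top["score"]
```

A call might look like `fill = load_fill_mask()` followed by `best_completion(fill(f"Paris is the capital of {fill.tokenizer.mask_token}."))`. Reading the mask token from the tokenizer avoids hard-coding it.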

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 32
	- eval_batch_size: 32
	- seed: 10
	- distributed_type: multi-GPU
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 100
	- training_steps: 300000
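
To make the schedule concrete, the sketch below computes the learning rate at a given step from the values listed above. It assumes the usual shape of a linear schedule with warmup: a linear ramp from 0 to the peak rate over the warmup steps, then a linear decay to 0 at the final step. The function name `linear_schedule_lr` is illustrative.

```python
LEARNING_RATE = 5e-05     # learning_rate from the list above
WARMUP_STEPS = 100        # lr_scheduler_warmup_steps
TRAINING_STEPS = 300_000  # training_steps


def linear_schedule_lr(step, base_lr=LEARNING_RATE,
                       warmup=WARMUP_STEPS, total=TRAINING_STEPS):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup:
        # Ramp up linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr at the end of warmup to 0 at `total`.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))
```

Under these assumptions the rate peaks at step 100 and then decays over the remaining 299,900 steps, so warmup covers only a very small fraction of the run.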

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy |
	|:-------------:|:-----:|:------:|:---------------:|:--------:|
	| 1.6028 | 1.0 | 7145 | 1.4525 | 0.6935 |
	| 1.5524 | 2.0 | 14290 | 1.4375 | 0.6993 |
	| 1.5323 | 3.0 | 21435 | 1.4194 | 0.6993 |
	| 1.5191 | 4.0 | 28580 | 1.4110 | 0.7027 |
	| 1.5025 | 5.0 | 35725 | 1.4168 | 0.7014 |
	| 1.4902 | 6.0 | 42870 | 1.3931 | 0.7012 |
	| 1.4813 | 7.0 | 50015 | 1.3738 | 0.7057 |
	| 1.4751 | 8.0 | 57160 | 1.4237 | 0.6996 |
	| 1.4689 | 9.0 | 64305 | 1.3969 | 0.7047 |
	| 1.4626 | 10.0 | 71450 | 1.3916 | 0.7068 |
	| 1.4566 | 11.0 | 78595 | 1.3686 | 0.7072 |
	| 1.451 | 12.0 | 85740 | 1.3811 | 0.7060 |
	| 1.4478 | 13.0 | 92885 | 1.3598 | 0.7092 |
	| 1.4441 | 14.0 | 100030 | 1.3790 | 0.7054 |
	| 1.4379 | 15.0 | 107175 | 1.3794 | 0.7066 |
	| 1.4353 | 16.0 | 114320 | 1.3609 | 0.7102 |
	| 1.43 | 17.0 | 121465 | 1.3685 | 0.7083 |
	| 1.4278 | 18.0 | 128610 | 1.3953 | 0.7036 |
	| 1.4219 | 19.0 | 135755 | 1.3756 | 0.7085 |
	| 1.4197 | 20.0 | 142900 | 1.3597 | 0.7090 |
	| 1.4169 | 21.0 | 150045 | 1.3673 | 0.7061 |
	| 1.4146 | 22.0 | 157190 | 1.3753 | 0.7073 |
	| 1.4109 | 23.0 | 164335 | 1.3696 | 0.7082 |
	| 1.4073 | 24.0 | 171480 | 1.3563 | 0.7092 |
	| 1.4054 | 25.0 | 178625 | 1.3712 | 0.7103 |
	| 1.402 | 26.0 | 185770 | 1.3528 | 0.7113 |
	| 1.4001 | 27.0 | 192915 | 1.3367 | 0.7123 |
	| 1.397 | 28.0 | 200060 | 1.3508 | 0.7118 |
	| 1.3955 | 29.0 | 207205 | 1.3572 | 0.7117 |
	| 1.3937 | 30.0 | 214350 | 1.3566 | 0.7095 |
	| 1.3901 | 31.0 | 221495 | 1.3515 | 0.7117 |
	| 1.3874 | 32.0 | 228640 | 1.3445 | 0.7118 |
	| 1.386 | 33.0 | 235785 | 1.3611 | 0.7097 |
	| 1.3833 | 34.0 | 242930 | 1.3502 | 0.7087 |
	| 1.3822 | 35.0 | 250075 | 1.3657 | 0.7108 |
	| 1.3797 | 36.0 | 257220 | 1.3576 | 0.7108 |
	| 1.3793 | 37.0 | 264365 | 1.3472 | 0.7106 |
	| 1.3763 | 38.0 | 271510 | 1.3323 | 0.7156 |
	| 1.3762 | 39.0 | 278655 | 1.3325 | 0.7145 |
	| 1.3748 | 40.0 | 285800 | 1.3243 | 0.7138 |
	| 1.3733 | 41.0 | 292945 | 1.3218 | 0.7170 |
	| 1.3722 | 41.99 | 300000 | 1.3074 | 0.7186 |
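
Two numbers in the table can be cross-checked with simple arithmetic. The sketch below shows why the last row reports epoch 41.99 rather than 42 (the run stops at a fixed step count, not at an epoch boundary) and converts a validation loss into a perplexity-style figure. The conversion assumes the loss is a mean cross-entropy in nats over masked tokens, which the card does not state; the function names are illustrative.

```python
import math

STEPS_PER_EPOCH = 7145          # step count at epoch 1.0 in the table
TRAINING_STEPS = 300_000        # total optimizer steps from the hyperparameter list
FINAL_VALIDATION_LOSS = 1.3074  # validation loss in the last table row


def epochs_completed(steps, steps_per_epoch=STEPS_PER_EPOCH):
    """Fractional number of epochs covered by `steps` optimizer steps."""
    return steps / steps_per_epoch


def pseudo_perplexity(loss):
    """exp(loss), assuming `loss` is a mean cross-entropy measured in nats."""
    return math.exp(loss)


final_epoch = round(epochs_completed(TRAINING_STEPS), 2)
final_ppl = pseudo_perplexity(FINAL_VALIDATION_LOSS)
```

Because 42 full epochs would need 300,090 steps, the 300,000-step budget ends 90 steps short of the 42nd epoch boundary.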


	### Framework versions

	- Transformers 4.26.0
	- PyTorch 1.14.0a0+410ce96
	- Datasets 2.9.0
	- Tokenizers 0.13.2