---
license: mit
tags:
- generated_from_trainer
datasets:
- wikitext
metrics:
- accuracy
model-index:
- name: wikitext_roberta-base
  results:
  - task:
      name: Masked Language Modeling
      type: fill-mask
    dataset:
      name: wikitext wikitext-2-raw-v1
      type: wikitext
      args: wikitext-2-raw-v1
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.7371052344006119
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# wikitext_roberta-base

This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the wikitext wikitext-2-raw-v1 dataset.
It achieves the following results on the evaluation set:
- Loss: 1.2143
- Accuracy: 0.7371
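
For quick experimentation, the checkpoint can be queried through the `transformers` fill-mask pipeline. A minimal sketch, assuming the model is available under the (hypothetical) id `wikitext_roberta-base`; substitute the full hub id or a local checkpoint path:

```python
from transformers import pipeline

# Assumed model id; replace with the actual hub repo or a local directory.
fill_mask = pipeline("fill-mask", model="wikitext_roberta-base")

# RoBERTa checkpoints use "<mask>" as the mask token.
for pred in fill_mask("The capital of France is <mask>."):
    print(f"{pred['token_str'].strip():>12}  {pred['score']:.3f}")
```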

## Model description

This is [roberta-base](https://huggingface.co/roberta-base) further trained with the masked language modeling (MLM) objective on WikiText-2 (raw). The architecture and tokenizer are unchanged from the base checkpoint; only the weights were updated during fine-tuning.

## Intended uses & limitations

The model is intended for masked-token prediction (`fill-mask`) and as a starting point for further fine-tuning on downstream tasks. Because the adaptation corpus (WikiText-2) is small and drawn from Wikipedia articles, the model is tuned toward Wikipedia-style English text and inherits the limitations and biases of [roberta-base](https://huggingface.co/roberta-base).

## Training and evaluation data

Training and evaluation used the `wikitext-2-raw-v1` configuration of the [wikitext](https://huggingface.co/datasets/wikitext) dataset (see the metadata above).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 50
- num_epochs: 20.0
- mixed_precision_training: Native AMP
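
A minimal sketch of the corresponding `TrainingArguments`, assuming a single device (8 per-device examples × 16 accumulation steps = 128 total train batch size); the `output_dir` name is hypothetical:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="wikitext_roberta-base",  # hypothetical output directory
    learning_rate=5e-5,
    per_device_train_batch_size=8,       # x16 accumulation steps = 128 effective
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=16,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=20.0,
    fp16=True,                           # "Native AMP" mixed precision
    evaluation_strategy="epoch",         # assumption: the table below logs one eval per epoch
)
```

The listed Adam settings (betas=(0.9, 0.999), epsilon=1e-08) match the `Trainer` defaults, so they need no explicit arguments.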

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.4175 | 0.99 | 37 | 1.3355 | 0.7194 |
| 1.438 | 1.99 | 74 | 1.2953 | 0.7249 |
| 1.4363 | 2.99 | 111 | 1.2759 | 0.7276 |
| 1.3391 | 3.99 | 148 | 1.2904 | 0.7252 |
| 1.3741 | 4.99 | 185 | 1.2621 | 0.7290 |
| 1.2771 | 5.99 | 222 | 1.2312 | 0.7353 |
| 1.287 | 6.99 | 259 | 1.2542 | 0.7289 |
| 1.29 | 7.99 | 296 | 1.2290 | 0.7345 |
| 1.2948 | 8.99 | 333 | 1.2537 | 0.7286 |
| 1.2741 | 9.99 | 370 | 1.2199 | 0.7354 |
| 1.2342 | 10.99 | 407 | 1.2520 | 0.7309 |
| 1.2199 | 11.99 | 444 | 1.2738 | 0.7260 |
| 1.206 | 12.99 | 481 | 1.2286 | 0.7335 |
| 1.221 | 13.99 | 518 | 1.2421 | 0.7327 |
| 1.2062 | 14.99 | 555 | 1.2402 | 0.7328 |
| 1.2305 | 15.99 | 592 | 1.2473 | 0.7308 |
| 1.2426 | 16.99 | 629 | 1.2250 | 0.7318 |
| 1.2096 | 17.99 | 666 | 1.2186 | 0.7353 |
| 1.1961 | 18.99 | 703 | 1.2214 | 0.7361 |
| 1.2136 | 19.99 | 740 | 1.2506 | 0.7311 |
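
Since the evaluation loss is a cross-entropy (here over masked tokens), it can be converted to a perplexity for easier comparison; a quick check on the reported final loss:

```python
import math

eval_loss = 1.2143  # final evaluation loss reported above
print(f"perplexity = {math.exp(eval_loss):.2f}")  # about 3.37
```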

### Framework versions

- Transformers 4.21.0.dev0
- Pytorch 1.11.0+cu113
- Datasets 2.3.3.dev0
- Tokenizers 0.12.1