	---
	tags:
	- generated_from_trainer
	datasets:
	- gokuls/wiki_book_corpus_complete_processed_bert_dataset
	metrics:
	- accuracy
	model-index:
	- name: HBERTv1_emb_compress_48_L10_H256_A4
	  results:
	  - task:
	      name: Masked Language Modeling
	      type: fill-mask
	    dataset:
	      name: gokuls/wiki_book_corpus_complete_processed_bert_dataset
	      type: gokuls/wiki_book_corpus_complete_processed_bert_dataset
	    metrics:
	    - name: Accuracy
	      type: accuracy
	      value: 0.15093352306316574
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# HBERTv1_emb_compress_48_L10_H256_A4

	This model is a masked language model trained on the gokuls/wiki_book_corpus_complete_processed_bert_dataset dataset. The base checkpoint is not specified.
	It achieves the following results on the evaluation set:
	- Loss: 6.0495
	- Accuracy: 0.1509
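
The evaluation loss can be converted into a perplexity for a more intuitive reading. The sketch below assumes the reported loss is the mean cross-entropy in nats over masked tokens, which is the usual Trainer convention for fill-mask but is not stated in this card. The variable names are illustrative.

```python
import math

# Values copied from the evaluation results above.
eval_loss = 6.0495
eval_accuracy = 0.1509

# Assumption: eval_loss is mean cross-entropy in nats over masked tokens,
# so exp(loss) is the masked-token perplexity.
perplexity = math.exp(eval_loss)

print(f"masked-token perplexity ~ {perplexity:.1f}")
print(f"masked-token accuracy   = {eval_accuracy:.2%}")
```

Under that assumption the perplexity is roughly 424, so the model is about as uncertain as a uniform choice among a few hundred tokens per masked position.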

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-05
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 10
	- distributed_type: multi-GPU
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 10000
	- num_epochs: 5
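
To make the learning-rate schedule concrete, here is a minimal sketch of linear warmup followed by linear decay using the values listed above. The dictionary keys follow `transformers.TrainingArguments` field names. The mapping of the batch size of 64 to a per-device value and the total step count (estimated from the results table, where step 450000 falls at epoch 4.92) are assumptions, and `lr_at` is a hypothetical helper, not part of any library.

```python
# Hyperparameters from the list above, keyed by the matching
# transformers.TrainingArguments field names.
# Assumption: the batch sizes of 64 are per-device values.
hparams = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 64,
    "per_device_eval_batch_size": 64,
    "seed": 10,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 10_000,
    "num_train_epochs": 5,
}

# Assumption: total optimizer steps, estimated from the results table
# (step 450000 is reported at epoch 4.92); not stated in the card.
TOTAL_STEPS = 457_000


def lr_at(step, total_steps=TOTAL_STEPS):
    """Linear warmup from 0 to the peak rate, then linear decay to 0."""
    peak = hparams["learning_rate"]
    warmup = hparams["warmup_steps"]
    if step < warmup:
        return peak * step / warmup
    remaining = max(0, total_steps - step)
    return peak * remaining / (total_steps - warmup)


for step in (0, 5_000, 10_000, 100_000, 450_000):
    print(f"step {step:>7}: lr = {lr_at(step):.3e}")
```

With these settings the rate peaks at 1e-05 at step 10000, which coincides with the first logged evaluation, and decays toward zero by the end of training.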

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy |
	|:-------------:|:-----:|:------:|:---------------:|:--------:|
	| 7.1164 | 0.11 | 10000 | 7.0967 | 0.0830 |
	| 6.694 | 0.22 | 20000 | 6.6867 | 0.1065 |
	| 6.545 | 0.33 | 30000 | 6.5445 | 0.1171 |
	| 6.4556 | 0.44 | 40000 | 6.4527 | 0.1250 |
	| 6.3891 | 0.55 | 50000 | 6.3831 | 0.1305 |
	| 6.3404 | 0.66 | 60000 | 6.3334 | 0.1350 |
	| 6.2962 | 0.76 | 70000 | 6.2940 | 0.1377 |
	| 6.2669 | 0.87 | 80000 | 6.2629 | 0.1398 |
	| 6.2352 | 0.98 | 90000 | 6.2361 | 0.1412 |
	| 6.2179 | 1.09 | 100000 | 6.2150 | 0.1429 |
	| 6.191 | 1.2 | 110000 | 6.1970 | 0.1443 |
	| 6.1809 | 1.31 | 120000 | 6.1829 | 0.1441 |
	| 6.1699 | 1.42 | 130000 | 6.1692 | 0.1455 |
	| 6.1623 | 1.53 | 140000 | 6.1562 | 0.1453 |
	| 6.1422 | 1.64 | 150000 | 6.1480 | 0.1468 |
	| 6.1397 | 1.75 | 160000 | 6.1367 | 0.1468 |
	| 6.1342 | 1.86 | 170000 | 6.1284 | 0.1475 |
	| 6.1291 | 1.97 | 180000 | 6.1214 | 0.1478 |
	| 6.1157 | 2.08 | 190000 | 6.1132 | 0.1483 |
	| 6.1146 | 2.18 | 200000 | 6.1094 | 0.1484 |
	| 6.1018 | 2.29 | 210000 | 6.1013 | 0.1488 |
	| 6.1014 | 2.4 | 220000 | 6.0979 | 0.1488 |
	| 6.0935 | 2.51 | 230000 | 6.0936 | 0.1489 |
	| 6.0899 | 2.62 | 240000 | 6.0881 | 0.1491 |
	| 6.0858 | 2.73 | 250000 | 6.0851 | 0.1498 |
	| 6.0872 | 2.84 | 260000 | 6.0819 | 0.1497 |
	| 6.0858 | 2.95 | 270000 | 6.0784 | 0.1500 |
	| 6.0775 | 3.06 | 280000 | 6.0745 | 0.1501 |
	| 6.0715 | 3.17 | 290000 | 6.0720 | 0.1502 |
	| 6.0704 | 3.28 | 300000 | 6.0699 | 0.1502 |
	| 6.0678 | 3.39 | 310000 | 6.0668 | 0.1503 |
	| 6.0662 | 3.5 | 320000 | 6.0649 | 0.1503 |
	| 6.0569 | 3.6 | 330000 | 6.0622 | 0.1505 |
	| 6.0604 | 3.71 | 340000 | 6.0612 | 0.1506 |
	| 6.0525 | 3.82 | 350000 | 6.0586 | 0.1507 |
	| 6.0553 | 3.93 | 360000 | 6.0582 | 0.1506 |
	| 6.053 | 4.04 | 370000 | 6.0544 | 0.1508 |
	| 6.0594 | 4.15 | 380000 | 6.0553 | 0.1507 |
	| 6.0488 | 4.26 | 390000 | 6.0527 | 0.1509 |
	| 6.051 | 4.37 | 400000 | 6.0516 | 0.1509 |
	| 6.0553 | 4.48 | 410000 | 6.0518 | 0.1509 |
	| 6.0507 | 4.59 | 420000 | 6.0520 | 0.1509 |
	| 6.0514 | 4.7 | 430000 | 6.0501 | 0.1509 |
	| 6.0511 | 4.81 | 440000 | 6.0496 | 0.1511 |
	| 6.0527 | 4.92 | 450000 | 6.0493 | 0.1509 |
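
The table shows the validation loss flattening over time. A short sketch that quantifies this, using only rows copied from the table (the checkpoint closest to each epoch boundary), is given below. The choice of rows and the variable names are illustrative, and the steps-per-epoch figure is only an estimate because the epoch column is rounded to two decimals.

```python
# (step, epoch, validation loss) rows copied from the table above,
# taken at the step closest to each epoch boundary.
rows = [
    (90_000, 0.98, 6.2361),
    (180_000, 1.97, 6.1214),
    (270_000, 2.95, 6.0784),
    (360_000, 3.93, 6.0582),
    (450_000, 4.92, 6.0493),
]

# Steps per epoch implied by the last row (epoch values are rounded
# to two decimals in the table, so this is only an estimate).
last_step, last_epoch, _ = rows[-1]
steps_per_epoch = last_step / last_epoch
print(f"~{steps_per_epoch:,.0f} optimizer steps per epoch")

# Validation-loss improvement between consecutive checkpoints.
drops = [round(prev[2] - cur[2], 4) for prev, cur in zip(rows, rows[1:])]
for (_, epoch, _), drop in zip(rows[1:], drops):
    print(f"up to epoch {epoch:.2f}: validation loss fell by {drop:.4f}")
```

Each successive epoch contributes roughly half the improvement of the one before it or less, which is consistent with the accuracy column settling near 0.151.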


	### Framework versions

	- Transformers 4.33.2
	- PyTorch 1.14.0a0+410ce96
	- Datasets 2.14.5
	- Tokenizers 0.13.3