	---
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	base_model: microsoft/dit-base
	model-index:
	- name: dit_base
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# dit_base

	This model is a fine-tuned version of [microsoft/dit-base](https://huggingface.co/microsoft/dit-base) on the davanstrien/leicester_loaded_annotations dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.4527
	- Accuracy: 0.8190

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 64
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 50
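
The relationship between these values can be checked with a little arithmetic. The sketch below is illustrative only: it assumes a single device and takes the total of 300 optimizer steps from the last row of the training results table (neither is stated in the hyperparameter list). The schedule function is a plain-Python rendering of linear warmup followed by linear decay, not the Trainer's own code.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_RATIO = 0.1

# Assumptions, not stated in the hyperparameter list: one device, and 300
# optimizer steps in total (the last "Step" value in the results table).
NUM_DEVICES = 1
TOTAL_STEPS = 300

# Samples contributing to each optimizer update.
effective_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES

# Number of steps spent ramping the learning rate up from zero.
warmup_steps = round(WARMUP_RATIO * TOTAL_STEPS)


def linear_schedule_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)


print(effective_batch_size)  # 64, matching total_train_batch_size
print(warmup_steps)          # 30
for step in (0, 15, 30, 165, 300):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the learning rate peaks at 5e-05 at step 30 and reaches zero at step 300.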

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|
	| No log | 0.89 | 6 | 1.7452 | 0.4095 |
	| 1.8958 | 1.89 | 12 | 1.6185 | 0.4286 |
	| 1.8958 | 2.89 | 18 | 1.4731 | 0.4857 |
	| 1.8466 | 3.89 | 24 | 1.3459 | 0.5524 |
	| 1.445 | 4.89 | 30 | 1.1766 | 0.5810 |
	| 1.445 | 5.89 | 36 | 1.0902 | 0.6381 |
	| 1.2077 | 6.89 | 42 | 0.9331 | 0.6762 |
	| 1.2077 | 7.89 | 48 | 0.8431 | 0.6762 |
	| 1.0254 | 8.89 | 54 | 0.8657 | 0.6857 |
	| 0.8275 | 9.89 | 60 | 0.6801 | 0.7429 |
	| 0.8275 | 10.89 | 66 | 0.6699 | 0.7810 |
	| 0.8063 | 11.89 | 72 | 0.6296 | 0.7524 |
	| 0.8063 | 12.89 | 78 | 0.5498 | 0.7905 |
	| 0.7127 | 13.89 | 84 | 0.4974 | 0.8381 |
	| 0.6356 | 14.89 | 90 | 0.6715 | 0.7619 |
	| 0.6356 | 15.89 | 96 | 0.4602 | 0.8095 |
	| 0.6438 | 16.89 | 102 | 0.4886 | 0.8095 |
	| 0.6438 | 17.89 | 108 | 0.4332 | 0.8 |
	| 0.5329 | 18.89 | 114 | 0.4197 | 0.8095 |
	| 0.4932 | 19.89 | 120 | 0.4168 | 0.8190 |
	| 0.4932 | 20.89 | 126 | 0.4691 | 0.8 |
	| 0.4861 | 21.89 | 132 | 0.4263 | 0.8476 |
	| 0.4861 | 22.89 | 138 | 0.4464 | 0.8190 |
	| 0.4935 | 23.89 | 144 | 0.4857 | 0.7905 |
	| 0.433 | 24.89 | 150 | 0.4873 | 0.7810 |
	| 0.433 | 25.89 | 156 | 0.4641 | 0.8095 |
	| 0.4289 | 26.89 | 162 | 0.5316 | 0.8 |
	| 0.4289 | 27.89 | 168 | 0.3389 | 0.8571 |
	| 0.4204 | 28.89 | 174 | 0.4272 | 0.8 |
	| 0.3668 | 29.89 | 180 | 0.3493 | 0.8667 |
	| 0.3668 | 30.89 | 186 | 0.3861 | 0.8571 |
	| 0.4101 | 31.89 | 192 | 0.4216 | 0.8381 |
	| 0.4101 | 32.89 | 198 | 0.4258 | 0.8190 |
	| 0.3614 | 33.89 | 204 | 0.4409 | 0.8571 |
	| 0.3267 | 34.89 | 210 | 0.4475 | 0.8190 |
	| 0.3267 | 35.89 | 216 | 0.4316 | 0.8190 |
	| 0.3423 | 36.89 | 222 | 0.4095 | 0.8381 |
	| 0.3423 | 37.89 | 228 | 0.4671 | 0.8286 |
	| 0.3325 | 38.89 | 234 | 0.3994 | 0.8286 |
	| 0.3326 | 39.89 | 240 | 0.5004 | 0.8190 |
	| 0.3326 | 40.89 | 246 | 0.4103 | 0.8381 |
	| 0.2964 | 41.89 | 252 | 0.4469 | 0.8286 |
	| 0.2964 | 42.89 | 258 | 0.4774 | 0.8286 |
	| 0.3435 | 43.89 | 264 | 0.3843 | 0.8381 |
	| 0.3146 | 44.89 | 270 | 0.3710 | 0.8667 |
	| 0.3146 | 45.89 | 276 | 0.3392 | 0.8667 |
	| 0.3168 | 46.89 | 282 | 0.3597 | 0.8667 |
	| 0.3168 | 47.89 | 288 | 0.4143 | 0.8381 |
	| 0.3081 | 48.89 | 294 | 0.3579 | 0.8571 |
	| 0.3103 | 49.89 | 300 | 0.4527 | 0.8190 |
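
The evaluation metrics reported at the top of this card (loss 0.4527, accuracy 0.8190) match the final row at step 300, which is not the strongest row in the table. As a minimal sketch, the snippet below compares the final row with the best rows. It lists only a subset of rows copied from the table (the lowest-loss row, the highest-accuracy rows, and the last two rows); the variable names are illustrative.

```python
# (step, validation_loss, accuracy) rows copied from the table above.
# Only a subset is listed to keep the sketch short.
rows = [
    (168, 0.3389, 0.8571),
    (180, 0.3493, 0.8667),
    (270, 0.3710, 0.8667),
    (276, 0.3392, 0.8667),
    (282, 0.3597, 0.8667),
    (294, 0.3579, 0.8571),
    (300, 0.4527, 0.8190),
]

# Row with the lowest validation loss.
best_by_loss = min(rows, key=lambda row: row[1])

# Highest accuracy and every step that reached it.
best_accuracy = max(row[2] for row in rows)
steps_at_best_accuracy = [row[0] for row in rows if row[2] == best_accuracy]

# The last evaluation, which is what the card reports.
final_step, final_loss, final_accuracy = rows[-1]

print("lowest validation loss:", best_by_loss)
print("best accuracy:", best_accuracy, "at steps", steps_at_best_accuracy)
print("final:", final_step, final_loss, final_accuracy)
```

Validation loss and accuracy fluctuate from one evaluation to the next in the later epochs, so the final step is one sample from a noisy plateau.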


	### Framework versions

	- Transformers 4.25.1
	- PyTorch 1.12.1
	- Datasets 2.7.1
	- Tokenizers 0.13.1
