	---
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	model-index:
	- name: dit-small_tobacco3482_kd_MSE
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# dit-small_tobacco3482_kd_MSE

	This model is a fine-tuned version of [microsoft/dit-base](https://huggingface.co/microsoft/dit-base). The Trainer recorded the dataset as `None`; the model name suggests Tobacco3482, but the card does not confirm it.
	It achieves the following results on the evaluation set:
	- Loss: 6.7275
	- Accuracy: 0.21
	- Brier Loss: 0.8834
	- NLL (negative log-likelihood): 6.7677
	- F1 Micro: 0.2100
	- F1 Macro: 0.1146
	- ECE (expected calibration error): 0.2647
	- AURC (area under the risk-coverage curve): 0.7666
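
The card does not say how these calibration metrics were computed. As a rough illustration only, the sketch below implements common textbook definitions of the multi-class Brier loss, NLL, and ECE in plain Python. The function names, the 10-bin equal-width binning, and the toy inputs are assumptions made for this example, not the evaluation code behind the numbers above.

```python
import math

def brier_loss(probs, labels):
    """Mean over samples of the squared distance between the
    predicted distribution and the one-hot target."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(labels)

def nll(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)

def expected_calibration_error(probs, labels, n_bins=10):
    """Equal-width binned ECE; n_bins=10 is an assumed default."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)                      # top-class confidence
        pred = p.index(conf)               # predicted class
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == y else 0.0))
    n = len(labels)
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            avg_acc = sum(a for _, a in b) / len(b)
            ece += len(b) / n * abs(avg_acc - avg_conf)
    return ece

# Toy input (hypothetical): a uniform prediction over 10 classes for 4 samples.
toy_probs = [[0.1] * 10 for _ in range(4)]
toy_labels = [0, 1, 2, 3]
print(brier_loss(toy_probs, toy_labels))
print(nll(toy_probs, toy_labels))
print(expected_calibration_error(toy_probs, toy_labels))
```

Under this definition a uniform 10-class prediction has a Brier loss of 0.9, which gives a reference point for reading the value reported above.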

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- gradient_accumulation_steps: 16
	- total_train_batch_size: 256
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 25
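
The relationship between these values can be sketched as follows. This is a minimal illustration, assuming a single device (which is what 16 × 16 = 256 implies) and a hypothetical training set of 800 examples; the card does not state the dataset size, and 800 is used only because it reproduces 3 optimizer steps per epoch. The helper names are made up for this example, and the schedule follows the usual linear warmup then linear decay shape rather than quoting the Trainer's internals.

```python
import math

def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Examples consumed per optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices

def optimizer_steps_per_epoch(num_examples, per_device_batch, grad_accum_steps):
    """Optimizer updates per epoch when leftover micro-batches are not stepped."""
    batches_per_epoch = math.ceil(num_examples / per_device_batch)
    return max(batches_per_epoch // grad_accum_steps, 1)

def linear_schedule_lr(step, base_lr, total_steps, warmup_ratio):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# ASSUMPTION: 800 training examples (hypothetical, not stated in the card).
ASSUMED_NUM_EXAMPLES = 800

steps_per_epoch = optimizer_steps_per_epoch(ASSUMED_NUM_EXAMPLES, 16, 16)
total_steps = steps_per_epoch * 25  # num_epochs: 25

print(effective_batch_size(16, 16))  # 256, matching total_train_batch_size
print(steps_per_epoch, total_steps)
for step in (0, 4, 8, 40, 75):
    print(step, linear_schedule_lr(step, 2e-05, total_steps, 0.1))
```

With so few optimizer updates, the warmup phase covers only a handful of steps, which is worth keeping in mind when reading the training results.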

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|:----------:|:-------:|:--------:|:--------:|:------:|:------:|
	| No log | 0.96 | 3 | 7.1014 | 0.06 | 0.9055 | 7.9056 | 0.06 | 0.0114 | 0.1732 | 0.9050 |
	| No log | 1.96 | 6 | 6.9659 | 0.125 | 0.8970 | 10.1253 | 0.125 | 0.0631 | 0.2010 | 0.8465 |
	| No log | 2.96 | 9 | 6.8528 | 0.075 | 0.8954 | 7.0315 | 0.075 | 0.0258 | 0.1912 | 0.8871 |
	| No log | 3.96 | 12 | 6.8522 | 0.205 | 0.8955 | 7.0990 | 0.205 | 0.0776 | 0.2426 | 0.7588 |
	| No log | 4.96 | 15 | 6.8465 | 0.19 | 0.8959 | 7.1340 | 0.19 | 0.0627 | 0.2308 | 0.7536 |
	| No log | 5.96 | 18 | 6.8246 | 0.205 | 0.8937 | 7.1101 | 0.205 | 0.0867 | 0.2410 | 0.7354 |
	| No log | 6.96 | 21 | 6.8054 | 0.085 | 0.8918 | 7.0215 | 0.085 | 0.0435 | 0.1847 | 0.8289 |
	| No log | 7.96 | 24 | 6.8025 | 0.22 | 0.8879 | 6.8272 | 0.22 | 0.0967 | 0.2487 | 0.7438 |
	| No log | 8.96 | 27 | 6.8045 | 0.21 | 0.8871 | 6.3740 | 0.2100 | 0.0992 | 0.2412 | 0.7634 |
	| No log | 9.96 | 30 | 6.8013 | 0.22 | 0.8869 | 6.9538 | 0.22 | 0.1016 | 0.2495 | 0.7633 |
	| No log | 10.96 | 33 | 6.7920 | 0.215 | 0.8865 | 6.9670 | 0.2150 | 0.0968 | 0.2549 | 0.7577 |
	| No log | 11.96 | 36 | 6.7817 | 0.22 | 0.8867 | 6.9953 | 0.22 | 0.1004 | 0.2455 | 0.7437 |
	| No log | 12.96 | 39 | 6.7729 | 0.17 | 0.8884 | 6.9738 | 0.17 | 0.0891 | 0.2277 | 0.7865 |
	| No log | 13.96 | 42 | 6.7632 | 0.2 | 0.8873 | 6.9622 | 0.2000 | 0.0998 | 0.2393 | 0.7413 |
	| No log | 14.96 | 45 | 6.7548 | 0.215 | 0.8860 | 6.9576 | 0.2150 | 0.1010 | 0.2635 | 0.7189 |
	| No log | 15.96 | 48 | 6.7489 | 0.22 | 0.8857 | 6.8386 | 0.22 | 0.1024 | 0.2665 | 0.7098 |
	| No log | 16.96 | 51 | 6.7457 | 0.23 | 0.8855 | 6.8730 | 0.23 | 0.1129 | 0.2506 | 0.7217 |
	| No log | 17.96 | 54 | 6.7455 | 0.215 | 0.8864 | 6.8688 | 0.2150 | 0.1058 | 0.2576 | 0.7528 |
	| No log | 18.96 | 57 | 6.7424 | 0.16 | 0.8861 | 6.8631 | 0.16 | 0.0843 | 0.2281 | 0.8036 |
	| No log | 19.96 | 60 | 6.7380 | 0.155 | 0.8850 | 6.8443 | 0.155 | 0.0871 | 0.2315 | 0.7937 |
	| No log | 20.96 | 63 | 6.7348 | 0.195 | 0.8841 | 6.7769 | 0.195 | 0.0949 | 0.2501 | 0.7799 |
	| No log | 21.96 | 66 | 6.7317 | 0.175 | 0.8838 | 6.7692 | 0.175 | 0.1025 | 0.2421 | 0.7797 |
	| No log | 22.96 | 69 | 6.7293 | 0.175 | 0.8836 | 6.7682 | 0.175 | 0.1012 | 0.2452 | 0.7799 |
	| No log | 23.96 | 72 | 6.7281 | 0.205 | 0.8834 | 6.7672 | 0.205 | 0.1132 | 0.2566 | 0.7679 |
	| No log | 24.96 | 75 | 6.7275 | 0.21 | 0.8834 | 6.7677 | 0.2100 | 0.1146 | 0.2647 | 0.7666 |


	### Framework versions

	- Transformers 4.26.1
	- PyTorch 1.13.1.post200
	- Datasets 2.9.0
	- Tokenizers 0.13.2