	---
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	model-index:
	- name: dit-tiny_rvl_cdip_100_examples_per_class_kd_MSE_lr_fix
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# dit-tiny_rvl_cdip_100_examples_per_class_kd_MSE_lr_fix

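	This model is a fine-tuned version of [microsoft/dit-base](https://huggingface.co/microsoft/dit-base). The Trainer recorded the dataset as `None`; the model name suggests a subset of RVL-CDIP with 100 examples per class, but the card does not confirm this.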
	It achieves the following results on the evaluation set:
	- Loss: 1.4358
	- Accuracy: 0.195
	- Brier Loss: 0.9035
	- NLL: 12.0550
	- F1 Micro: 0.195
	- F1 Macro: 0.1471
	- ECE: 0.1675
	- AURC: 0.6988
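
The card does not say how these metrics were computed. As a rough illustration only, the sketch below uses common textbook definitions of accuracy, multi-class Brier loss, negative log-likelihood (NLL), and expected calibration error (ECE). The function names, the choice of 15 equal-width bins for ECE, and the summing of the Brier loss over classes are all assumptions, not the evaluation code behind the numbers above. AURC is not covered.

```python
import math

def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class equals the label."""
    hits = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) == y)
    return hits / len(labels)

def brier_loss(probs, labels):
    """Mean over samples of the squared error against the one-hot label,
    summed over classes (assumed convention)."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += sum((pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p))
    return total / len(labels)

def nll(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)

def expected_calibration_error(probs, labels, n_bins=15):
    """Bin predictions by confidence (equal-width bins, an assumption) and
    average |accuracy - confidence| per bin, weighted by bin size."""
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hits = [0] * n_bins
    for p, y in zip(probs, labels):
        conf = max(p)
        pred = p.index(conf)
        b = min(int(conf * n_bins), n_bins - 1)
        counts[b] += 1
        conf_sums[b] += conf
        hits[b] += int(pred == y)
    n = len(labels)
    ece = 0.0
    for b in range(n_bins):
        if counts[b]:
            bin_acc = hits[b] / counts[b]
            bin_conf = conf_sums[b] / counts[b]
            ece += (counts[b] / n) * abs(bin_acc - bin_conf)
    return ece

if __name__ == "__main__":
    # Toy input: a uniform prediction over 16 classes for four samples.
    probs = [[1 / 16] * 16 for _ in range(4)]
    labels = [0, 1, 2, 3]
    print(accuracy(probs, labels), brier_loss(probs, labels),
          nll(probs, labels), expected_calibration_error(probs, labels))
```

Under these definitions a uniform prediction over 16 classes has a Brier loss of 0.9375, which is close to the values in the first rows of the training results table. That is consistent with a summed-over-classes Brier loss, but it does not prove that the card used this exact formula.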

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 32
	- eval_batch_size: 32
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 25
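
To make the `linear` scheduler with a warmup ratio of 0.1 concrete, here is a minimal sketch of the learning rate as a function of the optimizer step. It assumes 625 total steps (25 epochs at 25 steps per epoch, as the results table indicates), a warmup length rounded up from `total_steps * warmup_ratio`, and a linear ramp from 0 to the peak followed by a linear decay to 0. The helper `linear_lr` is hypothetical and only approximates what the Trainer's scheduler does.

```python
import math

BASE_LR = 1e-4          # learning_rate from the list above
STEPS_PER_EPOCH = 25    # taken from the results table (step 25 at epoch 1.0)
NUM_EPOCHS = 25         # num_epochs from the list above
WARMUP_RATIO = 0.1      # lr_scheduler_warmup_ratio from the list above

def linear_lr(step,
              base_lr=BASE_LR,
              total_steps=STEPS_PER_EPOCH * NUM_EPOCHS,
              warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    # Assumption: the warmup length is rounded up to a whole number of steps.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr to 0 at the final step.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

if __name__ == "__main__":
    # Print the learning rate at the end of every fifth epoch.
    for epoch in range(0, NUM_EPOCHS + 1, 5):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:3d}  lr {linear_lr(step):.2e}")
```

Under these assumptions the peak rate of 0.0001 is reached after 63 steps, partway through the third epoch, and the rate is back to zero at step 625.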

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | Brier Loss | NLL | F1 Micro | F1 Macro | ECE | AURC |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|:----------:|:-------:|:--------:|:--------:|:------:|:------:|
	| No log | 1.0 | 25 | 1.5167 | 0.07 | 0.9368 | 20.8948 | 0.07 | 0.0305 | 0.1106 | 0.8850 |
	| No log | 2.0 | 50 | 1.5246 | 0.08 | 0.9362 | 21.4368 | 0.08 | 0.0346 | 0.1200 | 0.8659 |
	| No log | 3.0 | 75 | 1.5053 | 0.1 | 0.9340 | 23.7241 | 0.1000 | 0.0522 | 0.1280 | 0.8087 |
	| No log | 4.0 | 100 | 1.5097 | 0.0975 | 0.9322 | 17.3004 | 0.0975 | 0.0487 | 0.1220 | 0.8220 |
	| No log | 5.0 | 125 | 1.4926 | 0.12 | 0.9296 | 16.3893 | 0.12 | 0.0600 | 0.1284 | 0.7752 |
	| No log | 6.0 | 150 | 1.4838 | 0.105 | 0.9273 | 19.3692 | 0.1050 | 0.0356 | 0.1254 | 0.7955 |
	| No log | 7.0 | 175 | 1.4729 | 0.0975 | 0.9229 | 18.6899 | 0.0975 | 0.0411 | 0.1134 | 0.7963 |
	| No log | 8.0 | 200 | 1.4754 | 0.125 | 0.9196 | 17.7842 | 0.125 | 0.0676 | 0.1238 | 0.7778 |
	| No log | 9.0 | 225 | 1.4725 | 0.1125 | 0.9193 | 16.6572 | 0.1125 | 0.0505 | 0.1254 | 0.7839 |
	| No log | 10.0 | 250 | 1.4702 | 0.1175 | 0.9168 | 16.3975 | 0.1175 | 0.0556 | 0.1183 | 0.7638 |
	| No log | 11.0 | 275 | 1.4648 | 0.1175 | 0.9169 | 18.4274 | 0.1175 | 0.0558 | 0.1219 | 0.7806 |
	| No log | 12.0 | 300 | 1.4660 | 0.155 | 0.9166 | 15.6492 | 0.155 | 0.0791 | 0.1411 | 0.7512 |
	| No log | 13.0 | 325 | 1.4684 | 0.16 | 0.9164 | 17.1698 | 0.16 | 0.1140 | 0.1519 | 0.7285 |
	| No log | 14.0 | 350 | 1.4662 | 0.1175 | 0.9158 | 17.6999 | 0.1175 | 0.0501 | 0.1269 | 0.7637 |
	| No log | 15.0 | 375 | 1.4602 | 0.1675 | 0.9143 | 13.2540 | 0.1675 | 0.1153 | 0.1515 | 0.7223 |
	| No log | 16.0 | 400 | 1.4556 | 0.1325 | 0.9138 | 13.3868 | 0.1325 | 0.0881 | 0.1323 | 0.7558 |
	| No log | 17.0 | 425 | 1.4527 | 0.175 | 0.9128 | 11.1983 | 0.175 | 0.1334 | 0.1596 | 0.7153 |
	| No log | 18.0 | 450 | 1.4535 | 0.1625 | 0.9111 | 17.6046 | 0.1625 | 0.1021 | 0.1435 | 0.7379 |
	| No log | 19.0 | 475 | 1.4453 | 0.1825 | 0.9086 | 11.8948 | 0.1825 | 0.1228 | 0.1594 | 0.7098 |
	| 1.4614 | 20.0 | 500 | 1.4431 | 0.1525 | 0.9078 | 14.2631 | 0.1525 | 0.1115 | 0.1410 | 0.7293 |
	| 1.4614 | 21.0 | 525 | 1.4392 | 0.1825 | 0.9063 | 10.7664 | 0.1825 | 0.1378 | 0.1567 | 0.7058 |
	| 1.4614 | 22.0 | 550 | 1.4469 | 0.1775 | 0.9055 | 13.4724 | 0.1775 | 0.1212 | 0.1483 | 0.7107 |
	| 1.4614 | 23.0 | 575 | 1.4356 | 0.17 | 0.9039 | 11.8141 | 0.17 | 0.1232 | 0.1515 | 0.7091 |
	| 1.4614 | 24.0 | 600 | 1.4370 | 0.1875 | 0.9039 | 12.9338 | 0.1875 | 0.1384 | 0.1539 | 0.7017 |
	| 1.4614 | 25.0 | 625 | 1.4358 | 0.195 | 0.9035 | 12.0550 | 0.195 | 0.1471 | 0.1675 | 0.6988 |


	### Framework versions

	- Transformers 4.28.0.dev0
	- PyTorch 1.12.1+cu113
	- Datasets 2.12.0
	- Tokenizers 0.12.1