dit_base

This model is a fine-tuned version of microsoft/dit-base on the davanstrien/leicester_loaded_annotations dataset. At the final training step (epoch 49.89, step 300), it achieves the following results on the evaluation set:

Loss: 0.4527
Accuracy: 0.8190

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 50
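
As a rough illustration of how these settings interact, the sketch below recomputes the effective batch size and the learning rate produced by a linear schedule with warmup. It is a plain-Python approximation, not the training script used for this model. The total of 300 optimizer steps is taken from the last row of the training results, and the 30 warmup steps assume that the 0.1 warmup ratio is applied to that total; the function name is hypothetical.

```python
# Values copied from the hyperparameter list above.
BASE_LR = 5e-05
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 4

# Assumptions: 300 optimizer steps in total (the last step logged in the
# training results) and 30 warmup steps (warmup ratio 0.1 applied to 300).
TOTAL_STEPS = 300
WARMUP_STEPS = 30

# Gradients from 4 batches of 16 are accumulated before each optimizer step.
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS


def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper).

    Rises linearly from 0 to BASE_LR during warmup, then decays linearly
    to 0 at TOTAL_STEPS.
    """
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    for step in (0, 15, 30, 165, 300):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the learning rate peaks at step 30, which falls around epoch 5 in the training results, and reaches zero at the final step.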

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	0.89	6	1.7452	0.4095
1.8958	1.89	12	1.6185	0.4286
1.8958	2.89	18	1.4731	0.4857
1.8466	3.89	24	1.3459	0.5524
1.445	4.89	30	1.1766	0.5810
1.445	5.89	36	1.0902	0.6381
1.2077	6.89	42	0.9331	0.6762
1.2077	7.89	48	0.8431	0.6762
1.0254	8.89	54	0.8657	0.6857
0.8275	9.89	60	0.6801	0.7429
0.8275	10.89	66	0.6699	0.7810
0.8063	11.89	72	0.6296	0.7524
0.8063	12.89	78	0.5498	0.7905
0.7127	13.89	84	0.4974	0.8381
0.6356	14.89	90	0.6715	0.7619
0.6356	15.89	96	0.4602	0.8095
0.6438	16.89	102	0.4886	0.8095
0.6438	17.89	108	0.4332	0.8
0.5329	18.89	114	0.4197	0.8095
0.4932	19.89	120	0.4168	0.8190
0.4932	20.89	126	0.4691	0.8
0.4861	21.89	132	0.4263	0.8476
0.4861	22.89	138	0.4464	0.8190
0.4935	23.89	144	0.4857	0.7905
0.433	24.89	150	0.4873	0.7810
0.433	25.89	156	0.4641	0.8095
0.4289	26.89	162	0.5316	0.8
0.4289	27.89	168	0.3389	0.8571
0.4204	28.89	174	0.4272	0.8
0.3668	29.89	180	0.3493	0.8667
0.3668	30.89	186	0.3861	0.8571
0.4101	31.89	192	0.4216	0.8381
0.4101	32.89	198	0.4258	0.8190
0.3614	33.89	204	0.4409	0.8571
0.3267	34.89	210	0.4475	0.8190
0.3267	35.89	216	0.4316	0.8190
0.3423	36.89	222	0.4095	0.8381
0.3423	37.89	228	0.4671	0.8286
0.3325	38.89	234	0.3994	0.8286
0.3326	39.89	240	0.5004	0.8190
0.3326	40.89	246	0.4103	0.8381
0.2964	41.89	252	0.4469	0.8286
0.2964	42.89	258	0.4774	0.8286
0.3435	43.89	264	0.3843	0.8381
0.3146	44.89	270	0.3710	0.8667
0.3146	45.89	276	0.3392	0.8667
0.3168	46.89	282	0.3597	0.8667
0.3168	47.89	288	0.4143	0.8381
0.3081	48.89	294	0.3579	0.8571
0.3103	49.89	300	0.4527	0.8190
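
The headline metrics correspond to the last row of this table rather than to the best checkpoint. The sketch below shows one way to pick out the strongest evaluation steps from the log. The rows are an excerpt copied from the table above (the lowest-loss and highest-accuracy rows plus the final step), and the variable names are illustrative. Whether any intermediate checkpoint was actually saved is not stated in this card.

```python
# Excerpt of the training results above: (step, validation loss, accuracy).
rows = [
    (168, 0.3389, 0.8571),
    (180, 0.3493, 0.8667),
    (270, 0.3710, 0.8667),
    (276, 0.3392, 0.8667),
    (282, 0.3597, 0.8667),
    (294, 0.3579, 0.8571),
    (300, 0.4527, 0.8190),  # final step, reported as the headline result
]

# Step with the lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[1])

# Step with the highest accuracy, breaking ties by lower validation loss.
best_by_accuracy = max(rows, key=lambda r: (r[2], -r[1]))

final = rows[-1]

print("lowest validation loss:", best_by_loss)
print("highest accuracy:", best_by_accuracy)
print("final step:", final)
```

In this excerpt, accuracy at the final step is about five percentage points below the best logged value, so selecting a checkpoint by evaluation metric would likely have produced a stronger model than taking the last step.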

Framework versions

Transformers 4.25.1
Pytorch 1.12.1
Datasets 2.7.1
Tokenizers 0.13.1
