deit-base-distilled-patch16-224-55-fold3

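This model is a fine-tuned version of facebook/deit-base-distilled-patch16-224 on the imagefolder dataset, that is, images loaded from a local directory with the generic Hugging Face Datasets imagefolder loader; the underlying data is not identified. It achieves the following results on the evaluation set, which match the checkpoint at step 59 (epoch 16.8571) in the training results table: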

Loss: 0.4529
Accuracy: 0.8228

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 100
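
The relationship between these values can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a single device (so the total batch size is the per-device batch size times the accumulation steps) and assumes 300 optimizer steps in total, taken from the last step logged in the training results. The function name `linear_schedule_lr` is hypothetical, and the schedule is a plain reimplementation of linear warmup followed by linear decay, not the training script itself.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 32
gradient_accumulation_steps = 4
learning_rate = 5e-05
warmup_ratio = 0.1

# Assumption: one device, so effective batch = batch size x accumulation steps.
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# Assumption: 300 optimizer steps in total (the last step logged in the results).
total_steps = 300
warmup_steps = round(warmup_ratio * total_steps)


def linear_schedule_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / (total_steps - warmup_steps)


print("total_train_batch_size:", total_train_batch_size)
print("warmup_steps:", warmup_steps)
for step in (0, 15, 30, 165, 300):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the learning rate peaks at 5e-05 once warmup ends and then falls linearly to zero at the final step.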

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	0.8571	3	0.8367	0.4051
No log	2.0	7	0.7223	0.4557
0.7025	2.8571	10	0.7199	0.4684
0.7025	4.0	14	0.6096	0.7089
0.7025	4.8571	17	0.6278	0.5823
0.6356	6.0	21	0.5629	0.7089
0.6356	6.8571	24	0.5924	0.6835
0.6356	8.0	28	0.5365	0.7722
0.5493	8.8571	31	0.6082	0.6329
0.5493	10.0	35	0.7239	0.5949
0.5493	10.8571	38	0.5435	0.7722
0.5205	12.0	42	0.8530	0.5570
0.5205	12.8571	45	0.5530	0.6709
0.5205	14.0	49	0.4728	0.7722
0.4979	14.8571	52	0.9571	0.5570
0.4979	16.0	56	0.5193	0.7722
0.4979	16.8571	59	0.4529	0.8228
0.4957	18.0	63	0.4686	0.7975
0.4957	18.8571	66	0.5060	0.7722
0.3659	20.0	70	0.4821	0.7848
0.3659	20.8571	73	0.6116	0.7089
0.3659	22.0	77	0.5860	0.7215
0.2973	22.8571	80	0.7100	0.7089
0.2973	24.0	84	0.6446	0.7342
0.2973	24.8571	87	0.6294	0.7342
0.2647	26.0	91	0.5988	0.7342
0.2647	26.8571	94	0.5256	0.7342
0.2647	28.0	98	0.6628	0.7595
0.2527	28.8571	101	0.5054	0.7595
0.2527	30.0	105	0.7632	0.7595
0.2527	30.8571	108	0.5917	0.7848
0.2176	32.0	112	0.5293	0.7848
0.2176	32.8571	115	0.6048	0.7468
0.2176	34.0	119	0.5710	0.7468
0.1633	34.8571	122	0.5901	0.7595
0.1633	36.0	126	0.8161	0.7468
0.1633	36.8571	129	0.7202	0.7468
0.1753	38.0	133	0.8239	0.7215
0.1753	38.8571	136	0.8908	0.7215
0.1743	40.0	140	0.8519	0.7342
0.1743	40.8571	143	1.0071	0.7215
0.1743	42.0	147	0.7842	0.7342
0.1532	42.8571	150	0.7827	0.7089
0.1532	44.0	154	0.7150	0.7468
0.1532	44.8571	157	0.6905	0.7595
0.1526	46.0	161	0.9260	0.7089
0.1526	46.8571	164	0.7933	0.7595
0.1526	48.0	168	0.8580	0.7468
0.1519	48.8571	171	0.6899	0.7975
0.1519	50.0	175	0.7069	0.7848
0.1519	50.8571	178	0.6741	0.7595
0.1292	52.0	182	0.7183	0.7848
0.1292	52.8571	185	0.8051	0.7468
0.1292	54.0	189	0.6883	0.7722
0.1305	54.8571	192	0.8266	0.7468
0.1305	56.0	196	1.0871	0.7595
0.1305	56.8571	199	0.7595	0.7595
0.1129	58.0	203	0.6880	0.7595
0.1129	58.8571	206	1.0676	0.7595
0.1369	60.0	210	0.8078	0.7595
0.1369	60.8571	213	0.7850	0.7595
0.1369	62.0	217	0.6975	0.7722
0.127	62.8571	220	0.7212	0.7595
0.127	64.0	224	0.8967	0.7468
0.127	64.8571	227	1.0046	0.7595
0.1238	66.0	231	0.8611	0.7342
0.1238	66.8571	234	0.9676	0.7975
0.1238	68.0	238	1.3115	0.7215
0.1068	68.8571	241	1.0992	0.7468
0.1068	70.0	245	0.8765	0.7848
0.1068	70.8571	248	0.8510	0.7848
0.1019	72.0	252	0.7403	0.7975
0.1019	72.8571	255	0.7459	0.7975
0.1019	74.0	259	0.7705	0.7975
0.1002	74.8571	262	0.7535	0.7975
0.1002	76.0	266	0.7124	0.7722
0.1002	76.8571	269	0.7014	0.7342
0.1222	78.0	273	0.8068	0.7722
0.1222	78.8571	276	0.9451	0.7722
0.1091	80.0	280	1.0048	0.7848
0.1091	80.8571	283	0.9518	0.7722
0.1091	82.0	287	0.8575	0.7848
0.0957	82.8571	290	0.8441	0.7848
0.0957	84.0	294	0.8602	0.7848
0.0957	84.8571	297	0.8701	0.7848
0.1111	85.7143	300	0.8731	0.7848
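
The table can be read programmatically to locate the checkpoint behind the headline metrics. The sketch below is a minimal illustration using only a handful of rows copied from the table, not the full log. The value of 3.5 optimizer steps per epoch is an inference from the Epoch and Step columns (it would correspond to 14 micro-batches per epoch with gradient_accumulation_steps of 4); the card does not state the dataset size, so treat it as an assumption.

```python
import csv
import io

# A subset of rows copied from the table above (tab-separated, header first).
# Paste the full table here to analyse every checkpoint.
TABLE = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tAccuracy\n"
    "No log\t0.8571\t3\t0.8367\t0.4051\n"
    "0.5205\t14.0\t49\t0.4728\t0.7722\n"
    "0.4979\t16.8571\t59\t0.4529\t0.8228\n"
    "0.4957\t18.0\t63\t0.4686\t0.7975\n"
    "0.1238\t68.0\t238\t1.3115\t0.7215\n"
    "0.1111\t85.7143\t300\t0.8731\t0.7848\n"
)

# Assumption inferred from the Epoch/Step columns, not stated in the card.
STEPS_PER_EPOCH = 3.5

rows = []
for record in csv.DictReader(io.StringIO(TABLE), delimiter="\t"):
    rows.append(
        {
            "epoch": float(record["Epoch"]),
            "step": int(record["Step"]),
            "val_loss": float(record["Validation Loss"]),
            "accuracy": float(record["Accuracy"]),
        }
    )

# The checkpoint with the lowest validation loss among the rows provided.
best = min(rows, key=lambda r: r["val_loss"])
print("best checkpoint:", best)

# Check that the logged epoch agrees with step / STEPS_PER_EPOCH for each row.
for r in rows:
    implied_epoch = r["step"] / STEPS_PER_EPOCH
    print(r["step"], r["epoch"], round(implied_epoch, 4))
```

With the full table pasted in, the same `min` call selects by validation loss across all logged evaluations; swapping the key to accuracy with `max` gives the alternative selection criterion.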

Framework versions

Transformers 4.41.0
PyTorch 2.3.0+cu121
Datasets 2.19.1
Tokenizers 0.19.1

BilalMuftuoglu/deit-base-distilled-patch16-224-55-fold3
