deit-base-distilled-patch16-224-65-fold1

This model is a fine-tuned version of facebook/deit-base-distilled-patch16-224 on the imagefolder dataset. It achieves the following results on the evaluation set, which match the step-201 row (epoch 61.8462) of the training results rather than the final step:

Loss: 0.3816
Accuracy: 0.8732

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 100
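
The listed values are related: the total train batch size is the per-device batch size times the gradient accumulation steps (32 × 4 = 128, which implies a single device), and the linear scheduler with a warmup ratio of 0.1 ramps the learning rate up before decaying it to zero. A minimal sketch in plain Python, assuming the 300 optimizer steps seen in the training results and a warmup length of ceil(0.1 × 300) = 30 steps. The function name `linear_lr` is hypothetical, and this illustrates the schedule shape rather than reproducing the Trainer's own code:

```python
import math

LEARNING_RATE = 5e-5
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.1
TOTAL_STEPS = 300  # assumption: last step logged in the training results

# Samples contributing to each optimizer update on a single device.
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Number of optimizer steps spent warming up.
warmup_steps = math.ceil(WARMUP_RATIO * TOTAL_STEPS)


def linear_lr(step: int) -> float:
    """Linear warmup from 0 to the peak rate, then linear decay to 0."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / (TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    print("warmup steps:", warmup_steps)
    for step in (0, 15, 30, 165, 300):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 5e-05 at step 30 and is back to half of that by step 165.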

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	0.9231	3	0.7888	0.4930
No log	1.8462	6	0.7159	0.5070
No log	2.7692	9	0.7091	0.5070
0.7703	4.0	13	0.6908	0.5352
0.7703	4.9231	16	0.6527	0.6197
0.7703	5.8462	19	0.6236	0.7324
0.6435	6.7692	22	0.6357	0.6901
0.6435	8.0	26	0.5442	0.7042
0.6435	8.9231	29	0.5449	0.7183
0.5366	9.8462	32	0.5124	0.7465
0.5366	10.7692	35	0.5029	0.7042
0.5366	12.0	39	0.5486	0.7183
0.4577	12.9231	42	0.5394	0.6761
0.4577	13.8462	45	0.5511	0.7465
0.4577	14.7692	48	0.5794	0.6901
0.4187	16.0	52	0.5368	0.7324
0.4187	16.9231	55	0.4678	0.7887
0.4187	17.8462	58	0.6597	0.7042
0.3542	18.7692	61	0.4969	0.8169
0.3542	20.0	65	0.7103	0.7324
0.3542	20.9231	68	0.4979	0.7606
0.3057	21.8462	71	0.5271	0.7324
0.3057	22.7692	74	0.5357	0.7746
0.3057	24.0	78	0.4847	0.7887
0.2816	24.9231	81	0.5425	0.8310
0.2816	25.8462	84	0.5239	0.8028
0.2816	26.7692	87	0.4141	0.8310
0.2881	28.0	91	0.4997	0.8028
0.2881	28.9231	94	0.4216	0.8028
0.2881	29.8462	97	0.4668	0.7887
0.2421	30.7692	100	0.5904	0.7887
0.2421	32.0	104	0.5240	0.7746
0.2421	32.9231	107	0.9937	0.7606
0.2402	33.8462	110	0.4989	0.8028
0.2402	34.7692	113	0.7232	0.7887
0.2402	36.0	117	0.4815	0.8451
0.1862	36.9231	120	0.7431	0.7746
0.1862	37.8462	123	0.4434	0.8028
0.1862	38.7692	126	0.4760	0.7887
0.1783	40.0	130	0.5006	0.7887
0.1783	40.9231	133	0.4986	0.7887
0.1783	41.8462	136	0.7947	0.7887
0.1783	42.7692	139	0.4897	0.8310
0.1685	44.0	143	0.7500	0.7606
0.1685	44.9231	146	0.6053	0.7887
0.1685	45.8462	149	0.4777	0.8169
0.1779	46.7692	152	0.5800	0.7746
0.1779	48.0	156	0.4681	0.8451
0.1779	48.9231	159	0.7729	0.8028
0.1502	49.8462	162	0.6487	0.8028
0.1502	50.7692	165	0.5224	0.8169
0.1502	52.0	169	0.7017	0.8028
0.1586	52.9231	172	0.6034	0.8028
0.1586	53.8462	175	0.5791	0.8028
0.1586	54.7692	178	0.5651	0.8169
0.134	56.0	182	0.4862	0.8028
0.134	56.9231	185	0.6751	0.8169
0.134	57.8462	188	0.5925	0.8169
0.1602	58.7692	191	0.3982	0.8451
0.1602	60.0	195	0.5969	0.7887
0.1602	60.9231	198	0.5721	0.7887
0.1217	61.8462	201	0.3816	0.8732
0.1217	62.7692	204	0.4110	0.8310
0.1217	64.0	208	0.6716	0.7887
0.1274	64.9231	211	0.3499	0.8732
0.1274	65.8462	214	0.3671	0.8169
0.1274	66.7692	217	0.5318	0.7887
0.1277	68.0	221	0.6734	0.7887
0.1277	68.9231	224	0.4726	0.8028
0.1277	69.8462	227	0.4311	0.8169
0.1232	70.7692	230	0.7072	0.7746
0.1232	72.0	234	0.5859	0.7887
0.1232	72.9231	237	0.3758	0.8310
0.1293	73.8462	240	0.3673	0.8451
0.1293	74.7692	243	0.3673	0.8592
0.1293	76.0	247	0.4752	0.8169
0.1117	76.9231	250	0.4450	0.8310
0.1117	77.8462	253	0.4437	0.8451
0.1117	78.7692	256	0.4330	0.8310
0.1092	80.0	260	0.5095	0.8169
0.1092	80.9231	263	0.4948	0.8169
0.1092	81.8462	266	0.4135	0.8592
0.1092	82.7692	269	0.4190	0.8451
0.1151	84.0	273	0.4194	0.8732
0.1151	84.9231	276	0.4356	0.8310
0.1151	85.8462	279	0.4623	0.8028
0.1085	86.7692	282	0.4845	0.8310
0.1085	88.0	286	0.4998	0.8169
0.1085	88.9231	289	0.5181	0.8028
0.0908	89.8462	292	0.5373	0.8169
0.0908	90.7692	295	0.5465	0.8169
0.0908	92.0	299	0.5422	0.8169
0.0902	92.3077	300	0.5417	0.8169
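
The fractional epochs in the table follow from gradient accumulation. The logged values are consistent with 13 mini-batches per epoch (an inference from the table, not a figure the card states): each optimizer step consumes 4 mini-batches, so step 3 corresponds to 12/13 ≈ 0.9231 epochs and step 300 to 1200/13 ≈ 92.3077 epochs, meaning the logged run covers about 92.3 epochs rather than the configured 100. A minimal sketch of that arithmetic, using the hypothetical helper `epoch_at_step`:

```python
GRAD_ACCUM_STEPS = 4
BATCHES_PER_EPOCH = 13  # assumption inferred from the logged epoch values


def epoch_at_step(step: int) -> float:
    """Epochs completed after `step` optimizer steps."""
    return step * GRAD_ACCUM_STEPS / BATCHES_PER_EPOCH


if __name__ == "__main__":
    # Steps taken from the training results table.
    for step in (3, 13, 201, 300):
        print(step, round(epoch_at_step(step), 4))
```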

Framework versions

Transformers 4.41.0
PyTorch 2.3.0+cu121
Datasets 2.19.1
Tokenizers 0.19.1
