Human-Action-Recognition-VIT-Base-patch16-224

This model is a fine-tuned version of google/vit-base-patch16-224 on an unknown dataset. At the final logged training step (step 780), it reaches a validation loss of 0.4367 and an accuracy of 0.8687 on the evaluation set.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 64
eval_batch_size: 64
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 256
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
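
The relationship between these values can be checked with a short sketch. It assumes a single training device (which is what 64 x 4 = 256 implies) and takes the total of 780 optimizer steps from the last row of the training log; neither is stated explicitly in this card. The `linear_lr` helper is a hypothetical stand-in for a linear warmup and decay schedule, not the trainer's own code.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-5
TRAIN_BATCH_SIZE = 64
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.1

# Assumptions (not stated in the card): one device, and 780 optimizer
# steps in total, read from the last row of the training log.
NUM_DEVICES = 1
TOTAL_STEPS = 780


def effective_batch_size(per_device: int, accum: int, devices: int) -> int:
    """Number of samples contributing to one optimizer update."""
    return per_device * accum * devices


def linear_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES)
warmup_steps = math.ceil(WARMUP_RATIO * TOTAL_STEPS)

print(f"total_train_batch_size = {total_batch}")
print(f"warmup steps = {warmup_steps}")
for step in (0, warmup_steps // 2, warmup_steps, TOTAL_STEPS // 2, TOTAL_STEPS):
    lr = linear_lr(step, LEARNING_RATE, warmup_steps, TOTAL_STEPS)
    print(f"step {step:>3}: lr = {lr:.2e}")
```

Under these assumptions the learning rate peaks at 5e-05 after the warmup phase and decays linearly to zero at the final step.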

Training Loss	Epoch	Step	Validation Loss	Accuracy
10.2084	1.0	40	2.0027	0.4877
5.7018	2.0	80	0.7764	0.7774
3.1984	3.0	120	0.5612	0.8329
2.6944	4.0	160	0.5205	0.8437
2.4232	5.0	200	0.4874	0.8508
2.2387	6.0	240	0.4712	0.8567
2.0735	7.0	280	0.4715	0.8552
1.9519	8.0	320	0.4472	0.8587
1.8481	9.0	360	0.4504	0.8563
1.6348	10.0	400	0.4512	0.8583
1.6713	11.0	440	0.4621	0.8579
1.5573	12.0	480	0.4380	0.8659
1.5445	13.0	520	0.4347	0.8635
1.4436	14.0	560	0.4385	0.8683
1.388	15.0	600	0.4379	0.8679
1.4061	16.0	640	0.4391	0.8647
1.3256	17.0	680	0.4353	0.8671
1.3634	18.0	720	0.4360	0.8671
1.3661	19.0	760	0.4366	0.8679
1.3606	19.5063	780	0.4367	0.8687
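
The log above can be scanned programmatically to see where validation loss and accuracy were best. The following is a minimal sketch that parses the rows as copied from the table; the `Row` type and variable names are illustrative only. Whether the published weights correspond to the final step or to an earlier checkpoint is not stated in this card.

```python
from typing import NamedTuple

# Rows copied from the training log above:
# training loss, epoch, step, validation loss, accuracy.
LOG = """\
10.2084 1.0 40 2.0027 0.4877
5.7018 2.0 80 0.7764 0.7774
3.1984 3.0 120 0.5612 0.8329
2.6944 4.0 160 0.5205 0.8437
2.4232 5.0 200 0.4874 0.8508
2.2387 6.0 240 0.4712 0.8567
2.0735 7.0 280 0.4715 0.8552
1.9519 8.0 320 0.4472 0.8587
1.8481 9.0 360 0.4504 0.8563
1.6348 10.0 400 0.4512 0.8583
1.6713 11.0 440 0.4621 0.8579
1.5573 12.0 480 0.4380 0.8659
1.5445 13.0 520 0.4347 0.8635
1.4436 14.0 560 0.4385 0.8683
1.388 15.0 600 0.4379 0.8679
1.4061 16.0 640 0.4391 0.8647
1.3256 17.0 680 0.4353 0.8671
1.3634 18.0 720 0.4360 0.8671
1.3661 19.0 760 0.4366 0.8679
1.3606 19.5063 780 0.4367 0.8687
"""


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float


def parse_log(text: str) -> list[Row]:
    """Parse whitespace-separated log lines into Row records."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, accuracy = line.split()
        rows.append(Row(float(train_loss), float(epoch), int(step),
                        float(val_loss), float(accuracy)))
    return rows


rows = parse_log(LOG)
best_loss = min(rows, key=lambda r: r.val_loss)
best_acc = max(rows, key=lambda r: r.accuracy)

print(f"lowest validation loss: {best_loss.val_loss} at step {best_loss.step}")
print(f"highest accuracy: {best_acc.accuracy} at step {best_acc.step}")
```

In this log the lowest validation loss (0.4347) occurs at step 520, while the highest accuracy (0.8687) occurs at the final step, 780. Both metrics change only marginally after roughly epoch 12.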

Model size: 85.8M params (Safetensors)
Tensor type: F32
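
As a rough guide to download and memory footprint, the weight size can be estimated from the reported parameter count and tensor type. This is a back-of-the-envelope sketch: the parameter count is the rounded figure of 85.8M, and file metadata and runtime overhead (activations, optimizer state) are ignored.

```python
# Reported values: 85.8M parameters stored as F32 (32-bit floats).
NUM_PARAMS = 85.8e6      # rounded figure as reported
BYTES_PER_PARAM = 4      # 32 bits = 4 bytes

size_bytes = NUM_PARAMS * BYTES_PER_PARAM
size_mb = size_bytes / 1e6        # decimal megabytes
size_mib = size_bytes / 2**20     # binary mebibytes

print(f"approx. weight size: {size_mb:.1f} MB ({size_mib:.1f} MiB)")
```

This gives roughly 343 MB for the weights alone.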
