2024-01-12_one_stage_subgraphs_weighted_entropyreg_txt_vis_conc_6_ramp

This model is a fine-tuned version of microsoft/layoutlmv3-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4857
  • Accuracy: 0.77
  • Exit 0 Accuracy: 0.09
  • Exit 1 Accuracy: 0.7575

Judging by the model name, the Exit 0 and Exit 1 accuracies appear to be measured at intermediate early-exit heads, with the headline accuracy coming from the final classifier.

Model description

More information needed

Intended uses & limitations

More information needed
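
Although usage details are not documented, a minimal inference sketch is given below, assuming the checkpoint exposes the standard LayoutLMv3 sequence-classification interface; the checkpoint path is hypothetical, and the early-exit behaviour implied by the model name would require custom code beyond what is shown here:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, LayoutLMv3ForSequenceClassification

# Checkpoint path is hypothetical; substitute the actual repo id or local directory.
processor = AutoProcessor.from_pretrained("microsoft/layoutlmv3-base")  # apply_ocr=True by default (requires pytesseract)
model = LayoutLMv3ForSequenceClassification.from_pretrained("path/to/this-checkpoint")

image = Image.open("page.png").convert("RGB")
encoding = processor(image, return_tensors="pt")  # OCR extracts words and bounding boxes

with torch.no_grad():
    outputs = model(**encoding)
predicted = outputs.logits.argmax(-1).item()
print(model.config.id2label[predicted])
```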

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent Trainer configuration is sketched after the list):

  • learning_rate: 2e-05
  • train_batch_size: 2
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 24
  • total_train_batch_size: 48
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60
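
As a rough reconstruction, a minimal sketch of this configuration with the Hugging Face Trainer API (Transformers 4.31.0) could look as follows; the actual run likely used custom early-exit training code, so this only mirrors the listed values:

```python
from transformers import TrainingArguments

# A hedged sketch of the hyperparameters listed above; output_dir is hypothetical.
training_args = TrainingArguments(
    output_dir="layoutlmv3-early-exit",  # hypothetical
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=24,      # total train batch size: 2 * 24 = 48
    lr_scheduler_type="linear",
    num_train_epochs=60,
    adam_beta1=0.9,                      # Adam settings match the Trainer defaults
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```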

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Exit 0 Accuracy | Exit 1 Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:---------------:|:---------------:|
| No log        | 0.96  | 16   | 2.6866          | 0.13     | 0.05            | 0.0625          |
| No log        | 1.98  | 33   | 2.5303          | 0.2175   | 0.035           | 0.0625          |
| No log        | 3.0   | 50   | 2.3471          | 0.295    | 0.035           | 0.0625          |
| No log        | 3.96  | 66   | 2.0891          | 0.3975   | 0.0475          | 0.0675          |
| No log        | 4.98  | 83   | 1.7694          | 0.5475   | 0.0475          | 0.0675          |
| No log        | 6.0   | 100  | 1.5006          | 0.6375   | 0.05            | 0.0875          |
| No log        | 6.96  | 116  | 1.3571          | 0.68     | 0.0525          | 0.0875          |
| No log        | 7.98  | 133  | 1.1444          | 0.7475   | 0.0525          | 0.115           |
| No log        | 9.0   | 150  | 1.0465          | 0.73     | 0.055           | 0.1225          |
| No log        | 9.96  | 166  | 0.9712          | 0.75     | 0.06            | 0.15            |
| No log        | 10.98 | 183  | 0.9017          | 0.79     | 0.0675          | 0.16            |
| No log        | 12.0  | 200  | 0.9028          | 0.7675   | 0.065           | 0.1925          |
| No log        | 12.96 | 216  | 0.8929          | 0.78     | 0.065           | 0.21            |
| No log        | 13.98 | 233  | 0.8808          | 0.7725   | 0.075           | 0.2825          |
| No log        | 15.0  | 250  | 0.8962          | 0.7825   | 0.08            | 0.3075          |
| No log        | 15.96 | 266  | 0.9893          | 0.7775   | 0.0825          | 0.3725          |
| No log        | 16.98 | 283  | 1.0809          | 0.7475   | 0.0775          | 0.5             |
| No log        | 18.0  | 300  | 0.9272          | 0.8      | 0.085           | 0.545           |
| No log        | 18.96 | 316  | 1.1704          | 0.7475   | 0.0825          | 0.5875          |
| No log        | 19.98 | 333  | 1.1274          | 0.7725   | 0.0825          | 0.6275          |
| No log        | 21.0  | 350  | 1.1633          | 0.7525   | 0.0825          | 0.6375          |
| No log        | 21.96 | 366  | 1.2537          | 0.76     | 0.085           | 0.6325          |
| No log        | 22.98 | 383  | 1.2364          | 0.7575   | 0.085           | 0.645           |
| No log        | 24.0  | 400  | 1.2045          | 0.7625   | 0.0875          | 0.66            |
| No log        | 24.96 | 416  | 1.2786          | 0.7475   | 0.085           | 0.6575          |
| No log        | 25.98 | 433  | 1.2697          | 0.77     | 0.0875          | 0.6775          |
| No log        | 27.0  | 450  | 1.3530          | 0.7675   | 0.0825          | 0.7025          |
| No log        | 27.96 | 466  | 1.3087          | 0.775    | 0.0825          | 0.7025          |
| No log        | 28.98 | 483  | 1.4329          | 0.7375   | 0.085           | 0.7175          |
| 0.9714        | 30.0  | 500  | 1.3908          | 0.7575   | 0.085           | 0.71            |
| 0.9714        | 30.96 | 516  | 1.4018          | 0.765    | 0.085           | 0.7175          |
| 0.9714        | 31.98 | 533  | 1.3794          | 0.7775   | 0.0875          | 0.7             |
| 0.9714        | 33.0  | 550  | 1.4277          | 0.76     | 0.0875          | 0.725           |
| 0.9714        | 33.96 | 566  | 1.4728          | 0.7575   | 0.09            | 0.73            |
| 0.9714        | 34.98 | 583  | 1.3926          | 0.77     | 0.09            | 0.7375          |
| 0.9714        | 36.0  | 600  | 1.4474          | 0.76     | 0.085           | 0.7425          |
| 0.9714        | 36.96 | 616  | 1.4008          | 0.77     | 0.085           | 0.7475          |
| 0.9714        | 37.98 | 633  | 1.4678          | 0.7575   | 0.085           | 0.7425          |
| 0.9714        | 39.0  | 650  | 1.4913          | 0.7725   | 0.0875          | 0.745           |
| 0.9714        | 39.96 | 666  | 1.4628          | 0.77     | 0.09            | 0.745           |
| 0.9714        | 40.98 | 683  | 1.4442          | 0.7675   | 0.09            | 0.74            |
| 0.9714        | 42.0  | 700  | 1.4448          | 0.7725   | 0.0875          | 0.75            |
| 0.9714        | 42.96 | 716  | 1.5156          | 0.755    | 0.0875          | 0.7425          |
| 0.9714        | 43.98 | 733  | 1.4809          | 0.75     | 0.0875          | 0.7425          |
| 0.9714        | 45.0  | 750  | 1.5115          | 0.7475   | 0.0875          | 0.75            |
| 0.9714        | 45.96 | 766  | 1.4681          | 0.7675   | 0.0925          | 0.755           |
| 0.9714        | 46.98 | 783  | 1.5000          | 0.765    | 0.09            | 0.75            |
| 0.9714        | 48.0  | 800  | 1.4784          | 0.7725   | 0.0875          | 0.755           |
| 0.9714        | 48.96 | 816  | 1.4947          | 0.76     | 0.09            | 0.7525          |
| 0.9714        | 49.98 | 833  | 1.4752          | 0.76     | 0.0875          | 0.7525          |
| 0.9714        | 51.0  | 850  | 1.4891          | 0.7675   | 0.09            | 0.76            |
| 0.9714        | 51.96 | 866  | 1.4876          | 0.7675   | 0.09            | 0.75            |
| 0.9714        | 52.98 | 883  | 1.4789          | 0.7725   | 0.09            | 0.755           |
| 0.9714        | 54.0  | 900  | 1.4820          | 0.765    | 0.09            | 0.7575          |
| 0.9714        | 54.96 | 916  | 1.4797          | 0.775    | 0.09            | 0.7575          |
| 0.9714        | 55.98 | 933  | 1.4880          | 0.77     | 0.09            | 0.76            |
| 0.9714        | 57.0  | 950  | 1.4864          | 0.77     | 0.09            | 0.7575          |
| 0.9714        | 57.6  | 960  | 1.4857          | 0.77     | 0.09            | 0.7575          |

Framework versions

  • Transformers 4.31.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.13.1
  • Tokenizers 0.13.3