GUE_tf_2-seqsight_65536_512_94M-L8_f

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_65536_512_94M on the mahdibaghbanzadeh/GUE_tf_2 dataset. It achieves the following results on the evaluation set:

Loss: 0.4623
F1 Score: 0.7958
Accuracy: 0.796

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
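
As a rough illustration of what these settings imply, the learning rate over the run can be sketched in plain Python. This is a minimal sketch, assuming zero warmup steps (the card lists none) and a linear decay that reaches zero at training_steps. The name linear_lr and the WARMUP_STEPS constant are illustrative and not taken from the training code.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 0.0005
TRAINING_STEPS = 10000
WARMUP_STEPS = 0  # assumption: the card does not mention warmup


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TRAINING_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Print the rate at the start, quarter, midpoint, and end of training.
    for step in (0, 2500, 5000, 10000):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.0005, is halved by step 5000, and reaches zero at step 10000.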

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.5547	1.34	200	0.5189	0.7320	0.733
0.5143	2.68	400	0.5054	0.7519	0.752
0.505	4.03	600	0.4933	0.752	0.752
0.498	5.37	800	0.4858	0.7590	0.759
0.4907	6.71	1000	0.4973	0.7554	0.756
0.4832	8.05	1200	0.4856	0.7590	0.759
0.4792	9.4	1400	0.4860	0.7558	0.756
0.4773	10.74	1600	0.4819	0.7620	0.762
0.47	12.08	1800	0.4846	0.7680	0.768
0.4641	13.42	2000	0.4867	0.7600	0.76
0.464	14.77	2200	0.4889	0.7455	0.747
0.4555	16.11	2400	0.4929	0.7569	0.757
0.4551	17.45	2600	0.4876	0.7568	0.757
0.4448	18.79	2800	0.4858	0.7500	0.75
0.4458	20.13	3000	0.4846	0.7670	0.767
0.4419	21.48	3200	0.5008	0.7497	0.75
0.4395	22.82	3400	0.4946	0.7540	0.754
0.4326	24.16	3600	0.4901	0.7529	0.753
0.4296	25.5	3800	0.4934	0.7570	0.757
0.4307	26.85	4000	0.4928	0.7640	0.764
0.4233	28.19	4200	0.5084	0.7604	0.761
0.4223	29.53	4400	0.5101	0.7567	0.757
0.4149	30.87	4600	0.4971	0.7650	0.765
0.415	32.21	4800	0.5112	0.7590	0.759
0.4119	33.56	5000	0.5082	0.7526	0.753
0.4112	34.9	5200	0.5050	0.7668	0.767
0.4046	36.24	5400	0.5079	0.7660	0.766
0.4049	37.58	5600	0.5065	0.7600	0.76
0.4026	38.93	5800	0.5062	0.7680	0.768
0.3966	40.27	6000	0.5045	0.7649	0.765
0.3957	41.61	6200	0.5080	0.7630	0.763
0.3998	42.95	6400	0.5174	0.7609	0.761
0.3918	44.3	6600	0.5150	0.7620	0.762
0.3923	45.64	6800	0.5214	0.7606	0.761
0.3911	46.98	7000	0.5116	0.7639	0.764
0.3877	48.32	7200	0.5238	0.7670	0.767
0.3821	49.66	7400	0.5308	0.7568	0.757
0.3829	51.01	7600	0.5337	0.7545	0.755
0.3808	52.35	7800	0.5226	0.7610	0.761
0.3837	53.69	8000	0.5177	0.7630	0.763
0.3785	55.03	8200	0.5215	0.7629	0.763
0.381	56.38	8400	0.5212	0.7629	0.763
0.3762	57.72	8600	0.5233	0.7679	0.768
0.3761	59.06	8800	0.5251	0.7608	0.761
0.3743	60.4	9000	0.5303	0.7669	0.767
0.3782	61.74	9200	0.5235	0.7629	0.763
0.3772	63.09	9400	0.5264	0.7638	0.764
0.3742	64.43	9600	0.5234	0.766	0.766
0.3733	65.77	9800	0.5297	0.7618	0.762
0.3728	67.11	10000	0.5271	0.7659	0.766
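
The table can be summarized with a short script. The sketch below copies the Validation Loss column in row order, finds the step with the lowest logged value, and derives the steps per epoch from the last row. The training-set size estimate is an assumption: it holds only if training ran on one device without gradient accumulation, which the card does not state.

```python
# Validation loss per evaluation, copied from the table above in row order.
# Evaluations were logged every 200 steps, from step 200 to step 10000.
VAL_LOSS = [
    0.5189, 0.5054, 0.4933, 0.4858, 0.4973, 0.4856, 0.4860, 0.4819, 0.4846, 0.4867,
    0.4889, 0.4929, 0.4876, 0.4858, 0.4846, 0.5008, 0.4946, 0.4901, 0.4934, 0.4928,
    0.5084, 0.5101, 0.4971, 0.5112, 0.5082, 0.5050, 0.5079, 0.5065, 0.5062, 0.5045,
    0.5080, 0.5174, 0.5150, 0.5214, 0.5116, 0.5238, 0.5308, 0.5337, 0.5226, 0.5177,
    0.5215, 0.5212, 0.5233, 0.5251, 0.5303, 0.5235, 0.5264, 0.5234, 0.5297, 0.5271,
]
STEPS = list(range(200, 10001, 200))
assert len(STEPS) == len(VAL_LOSS)

# Step with the lowest logged validation loss.
best_step, best_loss = min(zip(STEPS, VAL_LOSS), key=lambda pair: pair[1])

# The last row reports epoch 67.11 at step 10000, which implies the number
# of optimizer steps per epoch.
steps_per_epoch = round(10000 / 67.11)

# Rough training-set size. Assumption: one device and no gradient
# accumulation, so each step consumes train_batch_size = 128 examples.
approx_train_examples = steps_per_epoch * 128

print(best_step, best_loss)
print(steps_per_epoch, approx_train_examples)
```

On these rows the lowest validation loss (0.4819) is logged at step 1600, while the training loss keeps falling through step 10000, a pattern consistent with overfitting in the later epochs.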

Framework versions

PEFT 0.9.0
Transformers 4.38.2
PyTorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
