GUE_tf_2-seqsight_65536_512_94M-L1_f

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_65536_512_94M on the mahdibaghbanzadeh/GUE_tf_2 dataset. It achieves the following results on the evaluation set:

Loss: 0.4450
F1 Score: 0.7889
Accuracy: 0.789

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
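
As a rough illustration of what lr_scheduler_type: linear implies for the values above, the sketch below computes the learning rate at a given step. It assumes zero warmup steps, since the card lists no warmup setting; the function name linear_lr and its warmup_steps parameter are hypothetical and not taken from the training script.

```python
# Minimal sketch of a linear learning-rate decay, using the card's values.
# ASSUMPTION: no warmup (the card does not list warmup steps).
# `linear_lr` is a hypothetical helper, not part of the original training code.

BASE_LR = 0.0005       # learning_rate from the card
TOTAL_STEPS = 10000    # training_steps from the card


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Return the learning rate at `step` under linear warmup then linear decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to BASE_LR during warmup.
        return BASE_LR * step / max(1, warmup_steps)
    # Decay linearly from BASE_LR to 0 over the remaining steps.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    for s in (0, 2500, 5000, 10000):
        print(s, linear_lr(s))
```

Under this assumption the rate falls from 0.0005 at step 0 to zero at step 10000, so the later rows of the results table were trained with a much smaller step size than the early ones.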

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.5677	1.34	200	0.5339	0.7190	0.72
0.5269	2.68	400	0.5186	0.7380	0.738
0.5186	4.03	600	0.5087	0.7450	0.745
0.5143	5.37	800	0.5039	0.7480	0.748
0.5077	6.71	1000	0.5009	0.7488	0.749
0.5016	8.05	1200	0.5032	0.7488	0.749
0.4998	9.4	1400	0.4977	0.7479	0.748
0.5008	10.74	1600	0.4922	0.7549	0.755
0.495	12.08	1800	0.5029	0.7517	0.752
0.4929	13.42	2000	0.4995	0.7566	0.757
0.4935	14.77	2200	0.4924	0.7476	0.748
0.4874	16.11	2400	0.4991	0.7577	0.758
0.4884	17.45	2600	0.4912	0.7530	0.753
0.4814	18.79	2800	0.4909	0.7530	0.753
0.4835	20.13	3000	0.4899	0.7589	0.759
0.4825	21.48	3200	0.4994	0.7497	0.75
0.4846	22.82	3400	0.4949	0.7558	0.756
0.4781	24.16	3600	0.4874	0.7540	0.754
0.4749	25.5	3800	0.4891	0.7570	0.757
0.4786	26.85	4000	0.4881	0.7540	0.754
0.4744	28.19	4200	0.4906	0.7549	0.755
0.474	29.53	4400	0.4962	0.7579	0.758
0.4724	30.87	4600	0.4927	0.7548	0.755
0.4745	32.21	4800	0.4919	0.7510	0.751
0.4709	33.56	5000	0.4947	0.7566	0.757
0.4719	34.9	5200	0.4936	0.7499	0.75
0.4691	36.24	5400	0.4891	0.7540	0.754
0.4699	37.58	5600	0.4887	0.7520	0.752
0.4665	38.93	5800	0.4890	0.7510	0.751
0.4656	40.27	6000	0.4876	0.7510	0.751
0.4662	41.61	6200	0.4930	0.7510	0.751
0.4668	42.95	6400	0.4954	0.7569	0.757
0.4659	44.3	6600	0.4934	0.7539	0.754
0.4662	45.64	6800	0.4956	0.7555	0.756
0.4625	46.98	7000	0.4910	0.7520	0.752
0.4645	48.32	7200	0.4944	0.7549	0.755
0.4607	49.66	7400	0.4919	0.7530	0.753
0.4585	51.01	7600	0.4934	0.7529	0.753
0.4608	52.35	7800	0.4927	0.7530	0.753
0.4645	53.69	8000	0.4904	0.7530	0.753
0.4593	55.03	8200	0.4897	0.7489	0.749
0.4612	56.38	8400	0.4937	0.7538	0.754
0.4616	57.72	8600	0.4885	0.7570	0.757
0.4587	59.06	8800	0.4915	0.7530	0.753
0.4597	60.4	9000	0.4929	0.7549	0.755
0.4615	61.74	9200	0.4896	0.7510	0.751
0.4606	63.09	9400	0.4911	0.7549	0.755
0.4577	64.43	9600	0.4904	0.7510	0.751
0.4619	65.77	9800	0.4915	0.7559	0.756
0.4585	67.11	10000	0.4903	0.7530	0.753
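
One way to read a log like this is to parse the rows and look for the step with the lowest validation loss or the highest F1 score. The sketch below does this on a four-row excerpt copied from the table (steps 200, 3000, 3600 and 10000), not on the full log; the names EvalRow and parse_rows are hypothetical helpers introduced for illustration.

```python
from typing import List, NamedTuple

# Four rows copied verbatim from the table above (an excerpt, not the full log).
EXCERPT = """
0.5677 1.34 200 0.5339 0.7190 0.72
0.4835 20.13 3000 0.4899 0.7589 0.759
0.4781 24.16 3600 0.4874 0.7540 0.754
0.4585 67.11 10000 0.4903 0.7530 0.753
"""


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    f1: float
    accuracy: float


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated rows in the table's column order."""
    rows = []
    for line in text.strip().splitlines():
        tl, ep, st, vl, f1, acc = line.split()
        rows.append(EvalRow(float(tl), float(ep), int(st),
                            float(vl), float(f1), float(acc)))
    return rows


rows = parse_rows(EXCERPT)
best_loss = min(rows, key=lambda r: r.val_loss)  # lowest validation loss
best_f1 = max(rows, key=lambda r: r.f1)          # highest F1 score
last = rows[-1]
# Optimizer steps per epoch, estimated from the final row (step / epoch).
steps_per_epoch = last.step / last.epoch

if __name__ == "__main__":
    print("lowest validation loss at step", best_loss.step)
    print("highest F1 at step", best_f1.step)
    print("approx. steps per epoch:", round(steps_per_epoch))
```

In the full table the same two rows stand out: validation loss is lowest at step 3600 and F1 peaks at step 3000, after which both metrics stay roughly flat while training loss keeps falling.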

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
