GUE_prom_prom_300_tata-seqsight_4096_512_27M-L32_f

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_4096_512_27M on the mahdibaghbanzadeh/GUE_prom_prom_300_tata dataset. It achieves the following results on the evaluation set:

Loss: 0.4581
F1 Score: 0.8108
Accuracy: 0.8108

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
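
As a rough illustration of what the linear scheduler setting implies over the 10000 training steps, the sketch below computes the learning rate at a given step. It assumes zero warmup steps and a decay to zero at the final step, neither of which is stated in this card. `linear_lr` is a hypothetical helper written for this example, not a function from any library.

```python
def linear_lr(step, base_lr=0.0005, total_steps=10000, warmup_steps=0):
    """Learning rate at `step` under a linear warmup/decay schedule.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 2500, 5000, 10000):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at the configured 0.0005, is halved by the midpoint of training, and reaches zero at step 10000.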

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.4977	5.13	200	0.4505	0.8125	0.8124
0.3948	10.26	400	0.5319	0.7643	0.7667
0.3334	15.38	600	0.4822	0.7978	0.7977
0.2789	20.51	800	0.5067	0.8044	0.8042
0.2299	25.64	1000	0.6173	0.8027	0.8026
0.19	30.77	1200	0.7005	0.8041	0.8042
0.1636	35.9	1400	0.7570	0.7990	0.7993
0.1285	41.03	1600	0.8049	0.7930	0.7928
0.1119	46.15	1800	0.9574	0.7823	0.7830
0.09	51.28	2000	0.9093	0.8043	0.8042
0.0883	56.41	2200	0.9730	0.7827	0.7830
0.0696	61.54	2400	1.1484	0.7893	0.7896
0.0625	66.67	2600	1.0474	0.7767	0.7765
0.0536	71.79	2800	1.1731	0.7863	0.7863
0.0544	76.92	3000	1.0924	0.7897	0.7896
0.0466	82.05	3200	1.2232	0.7909	0.7912
0.0466	87.18	3400	1.1918	0.7879	0.7879
0.044	92.31	3600	1.1418	0.8027	0.8026
0.0413	97.44	3800	1.1120	0.7848	0.7847
0.041	102.56	4000	1.2203	0.7880	0.7879
0.0366	107.69	4200	1.2529	0.7913	0.7912
0.0354	112.82	4400	1.2677	0.7815	0.7814
0.0338	117.95	4600	1.3405	0.7878	0.7879
0.0293	123.08	4800	1.3398	0.7731	0.7732
0.0314	128.21	5000	1.2806	0.7864	0.7863
0.0318	133.33	5200	1.2921	0.7946	0.7945
0.0269	138.46	5400	1.3859	0.7962	0.7961
0.0277	143.59	5600	1.3161	0.7930	0.7928
0.024	148.72	5800	1.4195	0.7897	0.7896
0.0227	153.85	6000	1.4223	0.7798	0.7798
0.0238	158.97	6200	1.4175	0.7929	0.7928
0.0212	164.1	6400	1.4446	0.7799	0.7798
0.0218	169.23	6600	1.4048	0.7881	0.7879
0.022	174.36	6800	1.5152	0.7812	0.7814
0.0194	179.49	7000	1.4982	0.7864	0.7863
0.0186	184.62	7200	1.4678	0.7946	0.7945
0.0183	189.74	7400	1.5020	0.7880	0.7879
0.0182	194.87	7600	1.5340	0.7880	0.7879
0.0171	200.0	7800	1.4942	0.7930	0.7928
0.0167	205.13	8000	1.4875	0.7913	0.7912
0.0171	210.26	8200	1.5960	0.7927	0.7928
0.016	215.38	8400	1.6081	0.7945	0.7945
0.0142	220.51	8600	1.5778	0.7881	0.7879
0.014	225.64	8800	1.5685	0.7913	0.7912
0.015	230.77	9000	1.6522	0.7863	0.7863
0.0137	235.9	9200	1.6601	0.7896	0.7896
0.0151	241.03	9400	1.5928	0.7897	0.7896
0.0141	246.15	9600	1.5832	0.7881	0.7879
0.0138	251.28	9800	1.6047	0.7929	0.7928
0.0122	256.41	10000	1.6062	0.7929	0.7928
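
The table shows training loss falling steadily while validation loss rises after step 200, which is the usual signature of overfitting. The sketch below is one way to check this on a handful of rows copied from the table (not the full log): it picks the row with the lowest validation loss, computes the gap between validation and training loss, and estimates optimizer steps per epoch from the Epoch and Step columns. The steps-per-epoch estimate assumes the logged epoch is simply the step count divided by steps per epoch.

```python
# (step, epoch, training_loss, validation_loss, f1, accuracy)
# A subset of rows copied from the training results table.
rows = [
    (200, 5.13, 0.4977, 0.4505, 0.8125, 0.8124),
    (400, 10.26, 0.3948, 0.5319, 0.7643, 0.7667),
    (1000, 25.64, 0.2299, 0.6173, 0.8027, 0.8026),
    (5000, 128.21, 0.0314, 1.2806, 0.7864, 0.7863),
    (10000, 256.41, 0.0122, 1.6062, 0.7929, 0.7928),
]

# Row with the lowest validation loss among the sampled rows.
best = min(rows, key=lambda r: r[3])
best_step = best[0]

# Validation loss minus training loss at each sampled step.
gaps = {step: round(val - train, 4) for step, _, train, val, _, _ in rows}

# Estimated optimizer steps per epoch, from the final logged row.
steps_per_epoch = round(rows[-1][0] / rows[-1][1])

print("best step by validation loss:", best_step)
print("validation - training loss:", gaps)
print("estimated steps per epoch:", steps_per_epoch)
```

Across the full table, step 200 also has both the lowest validation loss (0.4505) and the highest F1 score (0.8125), so an early checkpoint or early stopping may be worth considering when reproducing this run.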

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
