GUE_prom_prom_300_tata-seqsight_4096_512_27M-L1_f

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_4096_512_27M on the mahdibaghbanzadeh/GUE_prom_prom_300_tata dataset. It achieves the following results on the evaluation set:

Loss: 0.4472
F1 Score: 0.8205
Accuracy: 0.8206

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
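
The card names a linear scheduler but does not say whether warmup was used. The following is a minimal sketch of what the learning rate would look like over the run, assuming zero warmup steps and a linear decay from 0.0005 to zero at step 10000. The function name linear_lr and the warmup_steps parameter are illustrative, not taken from the training code.

```python
def linear_lr(step, base_lr=0.0005, total_steps=10000, warmup_steps=0):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the model card does not report warmup.
    """
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly so the rate reaches 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 2500, 5000, 7500, 10000):
        print(f"step {step:>5}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate is halved by step 5000 and is close to zero over the last few hundred steps.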

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.5518	5.13	200	0.4928	0.7569	0.7586
0.4631	10.26	400	0.4878	0.7897	0.7896
0.4386	15.38	600	0.4768	0.8076	0.8075
0.4201	20.51	800	0.4712	0.8027	0.8026
0.4035	25.64	1000	0.4733	0.8026	0.8026
0.3933	30.77	1200	0.4505	0.8092	0.8091
0.3811	35.9	1400	0.4497	0.8124	0.8124
0.3708	41.03	1600	0.4433	0.8174	0.8173
0.3631	46.15	1800	0.4533	0.8124	0.8124
0.3507	51.28	2000	0.4587	0.8140	0.8140
0.3415	56.41	2200	0.4481	0.8207	0.8206
0.3361	61.54	2400	0.4627	0.8157	0.8157
0.3242	66.67	2600	0.4618	0.8256	0.8254
0.3196	71.79	2800	0.4573	0.8190	0.8189
0.322	76.92	3000	0.4850	0.7874	0.7879
0.3099	82.05	3200	0.4673	0.8060	0.8059
0.3063	87.18	3400	0.4822	0.7942	0.7945
0.2999	92.31	3600	0.4886	0.7960	0.7961
0.2946	97.44	3800	0.4813	0.8011	0.8010
0.2903	102.56	4000	0.4762	0.8060	0.8059
0.2864	107.69	4200	0.4895	0.8059	0.8059
0.2826	112.82	4400	0.4961	0.7977	0.7977
0.2788	117.95	4600	0.5237	0.7957	0.7961
0.2743	123.08	4800	0.5102	0.7961	0.7961
0.2709	128.21	5000	0.5084	0.7962	0.7961
0.2692	133.33	5200	0.5092	0.8027	0.8026
0.266	138.46	5400	0.5223	0.7927	0.7928
0.26	143.59	5600	0.5235	0.7897	0.7896
0.2608	148.72	5800	0.5211	0.7913	0.7912
0.256	153.85	6000	0.5216	0.7897	0.7896
0.253	158.97	6200	0.5403	0.7912	0.7912
0.2555	164.1	6400	0.5346	0.7913	0.7912
0.2486	169.23	6600	0.5558	0.7912	0.7912
0.2504	174.36	6800	0.5522	0.7895	0.7896
0.2473	179.49	7000	0.5415	0.7864	0.7863
0.2461	184.62	7200	0.5402	0.7848	0.7847
0.2428	189.74	7400	0.5548	0.7880	0.7879
0.2422	194.87	7600	0.5647	0.7846	0.7847
0.2416	200.0	7800	0.5449	0.7881	0.7879
0.2401	205.13	8000	0.5543	0.7881	0.7879
0.2352	210.26	8200	0.5685	0.7814	0.7814
0.2391	215.38	8400	0.5669	0.7798	0.7798
0.2321	220.51	8600	0.5624	0.7848	0.7847
0.232	225.64	8800	0.5806	0.7830	0.7830
0.2348	230.77	9000	0.5824	0.7814	0.7814
0.2305	235.9	9200	0.5787	0.7798	0.7798
0.2328	241.03	9400	0.5733	0.7831	0.7830
0.2313	246.15	9600	0.5741	0.7815	0.7814
0.2308	251.28	9800	0.5789	0.7749	0.7749
0.2307	256.41	10000	0.5788	0.7766	0.7765
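
In the table above, training loss falls steadily while validation loss bottoms out at 0.4433 (step 1600) and climbs to 0.5788 by step 10000, which suggests overfitting in the later part of the run. The sketch below shows one way to pick a checkpoint from such a log. It uses a hand-copied subset of rows that includes the table's lowest validation loss and highest F1 score; the LogRow type and variable names are illustrative only.

```python
from typing import NamedTuple


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    f1: float
    accuracy: float


# Subset of rows copied from the training results table.
ROWS = [
    LogRow(0.5518, 5.13, 200, 0.4928, 0.7569, 0.7586),
    LogRow(0.3708, 41.03, 1600, 0.4433, 0.8174, 0.8173),
    LogRow(0.3415, 56.41, 2200, 0.4481, 0.8207, 0.8206),
    LogRow(0.3242, 66.67, 2600, 0.4618, 0.8256, 0.8254),
    LogRow(0.2416, 200.0, 7800, 0.5449, 0.7881, 0.7879),
    LogRow(0.2307, 256.41, 10000, 0.5788, 0.7766, 0.7765),
]

best_by_loss = min(ROWS, key=lambda r: r.val_loss)
best_by_f1 = max(ROWS, key=lambda r: r.f1)
final = ROWS[-1]

# Gap between the final and the best validation loss: a rough overfitting signal.
val_loss_gap = round(final.val_loss - best_by_loss.val_loss, 4)

# Step 7800 is logged as epoch 200.0, so one epoch is about 39 optimizer steps.
epoch_200 = next(r for r in ROWS if r.step == 7800)
steps_per_epoch = epoch_200.step / epoch_200.epoch

if __name__ == "__main__":
    print("lowest validation loss at step", best_by_loss.step)
    print("highest F1 score at step", best_by_f1.step)
    print("final minus best validation loss:", val_loss_gap)
    print("steps per epoch:", steps_per_epoch)
```

With a batch size of 128 and roughly 39 steps per epoch, the training split would hold at most about 4992 sequences, assuming one optimizer step per batch on a single device with no gradient accumulation (neither is stated in the card).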

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
