GUE_prom_prom_300_tata-seqsight_4096_512_27M-L8_f

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_4096_512_27M on the mahdibaghbanzadeh/GUE_prom_prom_300_tata dataset. It achieves the following results on the evaluation set:

Loss: 0.5410
F1 Score: 0.8141
Accuracy: 0.8140

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
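
As an illustration of what the linear scheduler implies for these settings, here is a minimal plain-Python sketch of the learning rate at a given step. It assumes zero warmup steps, since no warmup setting is listed, and `linear_lr` is a hypothetical helper name, not part of any library.

```python
# Sketch of a linear decay schedule, assuming zero warmup steps
# (the hyperparameter list does not mention warmup).
LEARNING_RATE = 0.0005   # learning_rate from the list above
TRAINING_STEPS = 10000   # training_steps from the list above


def linear_lr(step: int, base_lr: float = LEARNING_RATE,
              total_steps: int = TRAINING_STEPS) -> float:
    """Return the learning rate at `step` under linear decay to zero."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


if __name__ == "__main__":
    # Print the rate at the start, quarter, midpoint, and end of training.
    for step in (0, 2500, 5000, 10000):
        print(step, linear_lr(step))
```

Under this assumption the rate is halved by step 5000 and reaches zero at step 10000.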

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.5215	5.13	200	0.4665	0.7858	0.7863
0.4283	10.26	400	0.4824	0.7973	0.7977
0.3852	15.38	600	0.4544	0.8040	0.8042
0.3475	20.51	800	0.4464	0.8141	0.8140
0.323	25.64	1000	0.4715	0.8158	0.8157
0.294	30.77	1200	0.4832	0.8060	0.8059
0.2763	35.9	1400	0.5299	0.8141	0.8140
0.2495	41.03	1600	0.5521	0.8010	0.8010
0.2361	46.15	1800	0.5793	0.8174	0.8173
0.2194	51.28	2000	0.6114	0.8092	0.8091
0.2016	56.41	2200	0.6572	0.8058	0.8059
0.1875	61.54	2400	0.7338	0.7920	0.7928
0.1662	66.67	2600	0.7151	0.7960	0.7961
0.1592	71.79	2800	0.7766	0.7927	0.7928
0.1501	76.92	3000	0.7609	0.7911	0.7912
0.1387	82.05	3200	0.8021	0.8043	0.8042
0.1329	87.18	3400	0.8527	0.7957	0.7961
0.1231	92.31	3600	0.8418	0.7994	0.7993
0.1156	97.44	3800	0.8410	0.7880	0.7879
0.116	102.56	4000	0.9420	0.7941	0.7945
0.1066	107.69	4200	0.9582	0.7907	0.7912
0.0997	112.82	4400	0.9930	0.7907	0.7912
0.0967	117.95	4600	0.9556	0.7861	0.7863
0.0908	123.08	4800	0.9752	0.7877	0.7879
0.0871	128.21	5000	0.9768	0.7910	0.7912
0.0894	133.33	5200	0.9933	0.7945	0.7945
0.0851	138.46	5400	0.9695	0.7911	0.7912
0.08	143.59	5600	1.1321	0.7791	0.7798
0.0799	148.72	5800	1.0871	0.7927	0.7928
0.0735	153.85	6000	1.1066	0.7880	0.7879
0.0709	158.97	6200	1.1187	0.7944	0.7945
0.0717	164.1	6400	1.0812	0.7928	0.7928
0.0709	169.23	6600	1.0957	0.7961	0.7961
0.069	174.36	6800	1.1046	0.7846	0.7847
0.0665	179.49	7000	1.1428	0.7877	0.7879
0.0661	184.62	7200	1.0884	0.7815	0.7814
0.0626	189.74	7400	1.1188	0.7944	0.7945
0.0621	194.87	7600	1.1021	0.7929	0.7928
0.0596	200.0	7800	1.1288	0.7864	0.7863
0.058	205.13	8000	1.1790	0.7862	0.7863
0.055	210.26	8200	1.2018	0.7878	0.7879
0.0579	215.38	8400	1.2147	0.7795	0.7798
0.0566	220.51	8600	1.1783	0.7831	0.7830
0.0552	225.64	8800	1.1750	0.7846	0.7847
0.0554	230.77	9000	1.1935	0.7879	0.7879
0.0531	235.9	9200	1.1895	0.7846	0.7847
0.0553	241.03	9400	1.1748	0.7831	0.7830
0.0523	246.15	9600	1.1992	0.7863	0.7863
0.0537	251.28	9800	1.2021	0.7879	0.7879
0.0538	256.41	10000	1.2038	0.7879	0.7879
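
In the table above, training loss falls steadily while validation loss reaches its minimum at step 800 and rises afterwards, so the choice of checkpoint matters. The sketch below shows one way to pick a checkpoint from such a log; it uses only a handful of rows copied from the table, and the helper names (`best_by_loss`, `best_by_f1`, `steps_per_epoch`) are hypothetical.

```python
# A few (step, epoch, validation_loss, f1) rows copied from the table above.
ROWS = [
    (200, 5.13, 0.4665, 0.7858),
    (800, 20.51, 0.4464, 0.8141),
    (1000, 25.64, 0.4715, 0.8158),
    (1800, 46.15, 0.5793, 0.8174),
    (5000, 128.21, 0.9768, 0.7910),
    (7800, 200.0, 1.1288, 0.7864),
    (10000, 256.41, 1.2038, 0.7879),
]


def best_by_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r[2])


def best_by_f1(rows):
    """Row with the highest F1 score."""
    return max(rows, key=lambda r: r[3])


def steps_per_epoch(row):
    """Optimizer steps per epoch implied by one logged row."""
    step, epoch, _, _ = row
    return round(step / epoch)


if __name__ == "__main__":
    print("lowest validation loss at step", best_by_loss(ROWS)[0])
    print("highest F1 at step", best_by_f1(ROWS)[0])
    print("steps per epoch:", steps_per_epoch(ROWS[5]))
```

On these rows, the lowest validation loss is at step 800 and the highest F1 is at step 1800, which also holds for the full table. The logged epochs imply 39 steps per epoch; with a batch size of 128, that would correspond to at most 4992 training examples per epoch, assuming a single device and no gradient accumulation (neither is stated in this card).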

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
