GUE_prom_prom_300_all-seqsight_32768_512_43M-L8_f

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_32768_512_43M on the mahdibaghbanzadeh/GUE_prom_prom_300_all dataset. It achieves the following results on the evaluation set:

Loss: 0.2006
F1 Score: 0.9216
Accuracy: 0.9216

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
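
As a rough illustration of what the linear scheduler does with these values, the sketch below computes the learning rate at a given step. It assumes no warmup phase, since the card lists no warmup steps; the function name linear_lr and the WARMUP_STEPS constant are hypothetical and are not taken from the training script.

```python
# Sketch of a linear learning-rate schedule using the values listed above.
# Assumption: no warmup, because the card does not list any warmup steps.

BASE_LR = 0.0005      # learning_rate from the card
TOTAL_STEPS = 10000   # training_steps from the card
WARMUP_STEPS = 0      # assumed; not stated in the card


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 2500, 5000, 7500, 10000):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at the configured value, is halved at step 5000, and reaches zero at step 10000.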

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.3689	0.54	200	0.2509	0.9032	0.9032
0.2545	1.08	400	0.2269	0.9081	0.9081
0.2364	1.62	600	0.2112	0.9159	0.9159
0.2203	2.16	800	0.2049	0.9203	0.9203
0.2183	2.7	1000	0.2038	0.9164	0.9164
0.2107	3.24	1200	0.2041	0.9177	0.9177
0.2129	3.78	1400	0.2001	0.9182	0.9182
0.206	4.32	1600	0.1946	0.9220	0.9220
0.2031	4.86	1800	0.1933	0.9230	0.9230
0.199	5.41	2000	0.2003	0.9199	0.9199
0.1979	5.95	2200	0.1933	0.9231	0.9231
0.1985	6.49	2400	0.1892	0.9228	0.9228
0.1966	7.03	2600	0.1923	0.9253	0.9253
0.1907	7.57	2800	0.1905	0.9248	0.9248
0.1936	8.11	3000	0.1867	0.9265	0.9265
0.1901	8.65	3200	0.1891	0.9243	0.9243
0.1872	9.19	3400	0.1878	0.9247	0.9247
0.183	9.73	3600	0.1841	0.9255	0.9255
0.1901	10.27	3800	0.1859	0.9236	0.9236
0.1842	10.81	4000	0.1845	0.9277	0.9277
0.1845	11.35	4200	0.1855	0.9274	0.9274
0.1827	11.89	4400	0.1856	0.9262	0.9262
0.1807	12.43	4600	0.1813	0.9270	0.9270
0.1798	12.97	4800	0.1835	0.9265	0.9265
0.178	13.51	5000	0.1861	0.9272	0.9272
0.1787	14.05	5200	0.1860	0.9235	0.9235
0.1745	14.59	5400	0.1862	0.9275	0.9275
0.175	15.14	5600	0.1869	0.9262	0.9262
0.1725	15.68	5800	0.1846	0.9231	0.9231
0.1746	16.22	6000	0.1852	0.9258	0.9258
0.1702	16.76	6200	0.1853	0.9257	0.9257
0.1717	17.3	6400	0.1836	0.9260	0.9260
0.1738	17.84	6600	0.1820	0.9294	0.9294
0.1663	18.38	6800	0.1842	0.9235	0.9235
0.1726	18.92	7000	0.1802	0.9279	0.9279
0.1699	19.46	7200	0.1822	0.9272	0.9272
0.167	20.0	7400	0.1822	0.9289	0.9289
0.1712	20.54	7600	0.1813	0.9290	0.9291
0.1678	21.08	7800	0.1805	0.9289	0.9289
0.1652	21.62	8000	0.1828	0.9299	0.9299
0.1651	22.16	8200	0.1817	0.9274	0.9274
0.16	22.7	8400	0.1859	0.9258	0.9258
0.1684	23.24	8600	0.1830	0.9284	0.9284
0.1641	23.78	8800	0.1836	0.9262	0.9262
0.1684	24.32	9000	0.1815	0.9269	0.9269
0.1609	24.86	9200	0.1823	0.9274	0.9274
0.1624	25.41	9400	0.1812	0.9274	0.9274
0.1616	25.95	9600	0.1819	0.9277	0.9277
0.1634	26.49	9800	0.1821	0.9284	0.9284
0.1601	27.03	10000	0.1819	0.9284	0.9284
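
The Epoch and Step columns together give a rough idea of the training set size. The arithmetic below is only an estimate: it assumes a single device and no gradient accumulation, so that one step consumes one batch of 128 examples, and the epoch value in the table is rounded to two decimals.

```python
# Rough arithmetic on the last row of the results table.
# Assumptions: single device and no gradient accumulation,
# so one optimizer step processes exactly one batch of 128 examples.

FINAL_STEP = 10000        # last Step value in the table
FINAL_EPOCH = 27.03       # last Epoch value in the table (rounded)
TRAIN_BATCH_SIZE = 128    # train_batch_size from the hyperparameters

steps_per_epoch = FINAL_STEP / FINAL_EPOCH
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

print(f"steps per epoch (approx.): {steps_per_epoch:.0f}")
print(f"training examples (approx.): {approx_train_examples:.0f}")
```

This works out to about 370 steps per epoch, which suggests a training split in the neighborhood of 47,000 sequences. The exact figure should be checked against the dataset itself.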

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
