GUE_prom_prom_300_all-seqsight_4096_512_27M-L8_f

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_4096_512_27M on the mahdibaghbanzadeh/GUE_prom_prom_300_all dataset. It achieves the following results on the evaluation set:

Loss: 0.1978
F1 Score: 0.9221
Accuracy: 0.9221

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
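
As a rough illustration of what the linear scheduler implies, the sketch below computes the learning rate at a given step from the values listed above. It assumes zero warmup steps, since the card does not state a warmup setting, and the helper name linear_lr is hypothetical rather than part of any library.

```python
# Minimal sketch of a linear learning-rate decay, using the card's values.
# Assumption: no warmup (the card lists no warmup setting).
BASE_LR = 0.0005      # learning_rate from the card
TOTAL_STEPS = 10000   # training_steps from the card
WARMUP_STEPS = 0      # assumed, not stated in the card


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Return the learning rate at `step` under linear warmup then linear decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 2500, 5000, 7500, 10000):
        print(step, linear_lr(step))
```

Under these assumptions the rate halves by step 5000 and reaches zero at step 10000. If the actual run used warmup, the early part of the curve would differ.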

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.336	0.54	200	0.2405	0.9049	0.9049
0.2443	1.08	400	0.2229	0.9148	0.9149
0.2309	1.62	600	0.2086	0.9189	0.9189
0.2149	2.16	800	0.2024	0.9236	0.9236
0.2108	2.7	1000	0.1962	0.9206	0.9206
0.2042	3.24	1200	0.1978	0.9223	0.9223
0.2021	3.78	1400	0.1917	0.9221	0.9221
0.201	4.32	1600	0.1921	0.9248	0.9248
0.1925	4.86	1800	0.2013	0.9230	0.9230
0.1907	5.41	2000	0.1940	0.9240	0.9240
0.1877	5.95	2200	0.1855	0.9289	0.9289
0.187	6.49	2400	0.1814	0.9302	0.9302
0.1847	7.03	2600	0.1867	0.9267	0.9267
0.178	7.57	2800	0.1858	0.9275	0.9275
0.1824	8.11	3000	0.1864	0.9285	0.9285
0.1798	8.65	3200	0.1816	0.9296	0.9296
0.172	9.19	3400	0.1882	0.9265	0.9265
0.1734	9.73	3600	0.1801	0.9294	0.9294
0.1789	10.27	3800	0.1785	0.9304	0.9304
0.1748	10.81	4000	0.1793	0.9323	0.9323
0.1704	11.35	4200	0.1770	0.9323	0.9323
0.168	11.89	4400	0.1797	0.9323	0.9323
0.1686	12.43	4600	0.1743	0.9336	0.9336
0.1664	12.97	4800	0.1727	0.9324	0.9324
0.1642	13.51	5000	0.1791	0.9324	0.9324
0.1653	14.05	5200	0.1755	0.9304	0.9304
0.1596	14.59	5400	0.1759	0.9312	0.9313
0.1606	15.14	5600	0.1744	0.9338	0.9338
0.1563	15.68	5800	0.1790	0.9307	0.9307
0.1631	16.22	6000	0.1746	0.9307	0.9307
0.1565	16.76	6200	0.1747	0.9331	0.9331
0.1579	17.3	6400	0.1746	0.9343	0.9343
0.1591	17.84	6600	0.1721	0.9336	0.9336
0.1522	18.38	6800	0.1761	0.9336	0.9336
0.1571	18.92	7000	0.1733	0.9345	0.9345
0.1558	19.46	7200	0.1752	0.9333	0.9333
0.1512	20.0	7400	0.1746	0.9345	0.9345
0.1563	20.54	7600	0.1724	0.9340	0.9340
0.1512	21.08	7800	0.1714	0.9343	0.9343
0.1486	21.62	8000	0.1745	0.9343	0.9343
0.1496	22.16	8200	0.1735	0.9340	0.9340
0.1485	22.7	8400	0.1732	0.9350	0.9350
0.1511	23.24	8600	0.1735	0.9341	0.9341
0.1485	23.78	8800	0.1741	0.9343	0.9343
0.1524	24.32	9000	0.1738	0.9338	0.9338
0.1468	24.86	9200	0.1729	0.9358	0.9358
0.1482	25.41	9400	0.1743	0.9346	0.9346
0.1482	25.95	9600	0.1731	0.9343	0.9343
0.1472	26.49	9800	0.1729	0.9345	0.9345
0.1457	27.03	10000	0.1730	0.9343	0.9343
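
One way to read the table above is to pick out the evaluation rows with the lowest validation loss and the highest F1 score. The sketch below does this on a five-row excerpt copied from the table. The names EvalRow, EXCERPT, and parse_rows are hypothetical helpers for illustration, not part of the training code.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    f1: float
    accuracy: float


# Excerpt of the results table (columns: training loss, epoch, step,
# validation loss, F1 score, accuracy). Only five rows are included here.
EXCERPT = """
0.336   0.54   200    0.2405  0.9049  0.9049
0.1664  12.97  4800   0.1727  0.9324  0.9324
0.1512  21.08  7800   0.1714  0.9343  0.9343
0.1468  24.86  9200   0.1729  0.9358  0.9358
0.1457  27.03  10000  0.1730  0.9343  0.9343
"""


def parse_rows(text):
    """Parse whitespace-separated table rows into EvalRow records."""
    rows = []
    for line in text.strip().splitlines():
        tl, ep, st, vl, f1, acc = line.split()
        rows.append(EvalRow(float(tl), float(ep), int(st),
                            float(vl), float(f1), float(acc)))
    return rows


rows = parse_rows(EXCERPT)
best_by_loss = min(rows, key=lambda r: r.val_loss)  # lowest validation loss
best_by_f1 = max(rows, key=lambda r: r.f1)          # highest F1 score
# Rough steps-per-epoch estimate from the final row (step / epoch).
steps_per_epoch = rows[-1].step / rows[-1].epoch

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss.step, best_by_loss.val_loss)
    print("highest F1:", best_by_f1.step, best_by_f1.f1)
    print("approx. steps per epoch:", round(steps_per_epoch))
```

The same two rows are also the extremes of the full table: validation loss is lowest at step 7800 (0.1714) and F1 is highest at step 9200 (0.9358), so the final checkpoint at step 10000 is best on neither metric. The step-to-epoch ratio works out to roughly 370 steps per epoch. With a batch size of 128 this suggests on the order of 47,000 training examples, assuming a single device and no gradient accumulation, neither of which the card states.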

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
