# GUE_prom_prom_300_all-seqsight_32768_512_43M-L32_f
This model is a fine-tuned version of [mahdibaghbanzadeh/seqsight_32768_512_43M](https://huggingface.co/mahdibaghbanzadeh/seqsight_32768_512_43M) on the [mahdibaghbanzadeh/GUE_prom_prom_300_all](https://huggingface.co/datasets/mahdibaghbanzadeh/GUE_prom_prom_300_all) dataset. It achieves the following results on the evaluation set:
- Loss: 0.1981
- F1 Score: 0.9235
- Accuracy: 0.9235
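The card does not include a usage example, so here is a minimal inference sketch. It assumes the adapter is published under a repo id matching the model name above, that the base model works with `AutoTokenizer`, and that the adapter carries a standard sequence-classification head; `trust_remote_code=True` is a guess for a custom architecture.

```python
# Minimal inference sketch (not from the card). Assumptions: the repo ids below
# exist on the Hub, and the base model may need trust_remote_code=True.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForSequenceClassification

base_id = "mahdibaghbanzadeh/seqsight_32768_512_43M"
adapter_id = "mahdibaghbanzadeh/GUE_prom_prom_300_all-seqsight_32768_512_43M-L32_f"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
model = AutoPeftModelForSequenceClassification.from_pretrained(adapter_id, trust_remote_code=True)
model.eval()

# Score a 300-bp sequence for the promoter-detection task.
inputs = tokenizer("ACGT" * 75, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))
```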
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged sketch mapping them onto `TrainingArguments` follows the list):
- learning_rate: 0.0005
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
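As referenced above, this is how the listed values might map onto `transformers.TrainingArguments` (version 4.38.2, per the framework list below). The LoRA configuration and dataset preprocessing are not given in the card, and the 200-step eval cadence is inferred from the results table, so treat this as a sketch rather than the exact training script.

```python
# Sketch of a TrainingArguments object matching the listed hyperparameters.
# Assumptions: eval/logging every 200 steps (inferred from the results table);
# the default optimizer already uses betas=(0.9, 0.999) and epsilon=1e-8.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="GUE_prom_prom_300_all-seqsight_32768_512_43M-L32_f",
    learning_rate=5e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    lr_scheduler_type="linear",
    max_steps=10_000,
    evaluation_strategy="steps",  # renamed eval_strategy in later releases
    eval_steps=200,
    logging_steps=200,
)
```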
### Training results
| Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|
| 0.3368 | 0.54 | 200 | 0.2353 | 0.9084 | 0.9084 |
| 0.2343 | 1.08 | 400 | 0.2030 | 0.9176 | 0.9176 |
| 0.2205 | 1.62 | 600 | 0.1989 | 0.9197 | 0.9198 |
| 0.209 | 2.16 | 800 | 0.1961 | 0.9209 | 0.9209 |
| 0.207 | 2.7 | 1000 | 0.1989 | 0.9149 | 0.9149 |
| 0.1983 | 3.24 | 1200 | 0.1933 | 0.9184 | 0.9184 |
| 0.1988 | 3.78 | 1400 | 0.1986 | 0.9192 | 0.9193 |
| 0.1943 | 4.32 | 1600 | 0.1880 | 0.9255 | 0.9255 |
| 0.1883 | 4.86 | 1800 | 0.1852 | 0.9248 | 0.9248 |
| 0.182 | 5.41 | 2000 | 0.1877 | 0.9265 | 0.9265 |
| 0.1841 | 5.95 | 2200 | 0.1843 | 0.9263 | 0.9264 |
| 0.1817 | 6.49 | 2400 | 0.1895 | 0.9239 | 0.9240 |
| 0.1795 | 7.03 | 2600 | 0.1829 | 0.9270 | 0.9270 |
| 0.1726 | 7.57 | 2800 | 0.1849 | 0.9267 | 0.9267 |
| 0.1723 | 8.11 | 3000 | 0.1821 | 0.9287 | 0.9287 |
| 0.1686 | 8.65 | 3200 | 0.1881 | 0.9278 | 0.9279 |
| 0.1656 | 9.19 | 3400 | 0.1821 | 0.9282 | 0.9282 |
| 0.1605 | 9.73 | 3600 | 0.1768 | 0.9291 | 0.9291 |
| 0.1656 | 10.27 | 3800 | 0.1778 | 0.9289 | 0.9289 |
| 0.1606 | 10.81 | 4000 | 0.1741 | 0.9316 | 0.9316 |
| 0.1594 | 11.35 | 4200 | 0.1806 | 0.9309 | 0.9309 |
| 0.1563 | 11.89 | 4400 | 0.1826 | 0.9305 | 0.9306 |
| 0.1554 | 12.43 | 4600 | 0.1727 | 0.9323 | 0.9323 |
| 0.1513 | 12.97 | 4800 | 0.1741 | 0.9285 | 0.9285 |
| 0.1481 | 13.51 | 5000 | 0.1776 | 0.9297 | 0.9297 |
| 0.1486 | 14.05 | 5200 | 0.1869 | 0.9218 | 0.9218 |
| 0.1429 | 14.59 | 5400 | 0.1801 | 0.9304 | 0.9304 |
| 0.1445 | 15.14 | 5600 | 0.1792 | 0.9316 | 0.9316 |
| 0.1408 | 15.68 | 5800 | 0.1781 | 0.9304 | 0.9304 |
| 0.1408 | 16.22 | 6000 | 0.1751 | 0.9301 | 0.9301 |
| 0.1352 | 16.76 | 6200 | 0.1871 | 0.9263 | 0.9264 |
| 0.138 | 17.3 | 6400 | 0.1750 | 0.9294 | 0.9294 |
| 0.1358 | 17.84 | 6600 | 0.1777 | 0.9323 | 0.9323 |
| 0.1315 | 18.38 | 6800 | 0.1856 | 0.9299 | 0.9299 |
| 0.1369 | 18.92 | 7000 | 0.1762 | 0.9316 | 0.9316 |
| 0.1321 | 19.46 | 7200 | 0.1793 | 0.9306 | 0.9306 |
| 0.1311 | 20.0 | 7400 | 0.1807 | 0.9334 | 0.9334 |
| 0.1323 | 20.54 | 7600 | 0.1799 | 0.9306 | 0.9306 |
| 0.1272 | 21.08 | 7800 | 0.1808 | 0.9307 | 0.9307 |
| 0.1237 | 21.62 | 8000 | 0.1877 | 0.9280 | 0.9280 |
| 0.1246 | 22.16 | 8200 | 0.1837 | 0.9302 | 0.9302 |
| 0.122 | 22.7 | 8400 | 0.1848 | 0.9301 | 0.9301 |
| 0.1236 | 23.24 | 8600 | 0.1878 | 0.9299 | 0.9299 |
| 0.1224 | 23.78 | 8800 | 0.1875 | 0.9294 | 0.9294 |
| 0.1232 | 24.32 | 9000 | 0.1848 | 0.9304 | 0.9304 |
| 0.1228 | 24.86 | 9200 | 0.1844 | 0.9307 | 0.9307 |
| 0.1188 | 25.41 | 9400 | 0.1856 | 0.9299 | 0.9299 |
| 0.12 | 25.95 | 9600 | 0.1847 | 0.9316 | 0.9316 |
| 0.1195 | 26.49 | 9800 | 0.1859 | 0.9309 | 0.9309 |
| 0.1165 | 27.03 | 10000 | 0.1854 | 0.9318 | 0.9318 |
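The F1 Score and Accuracy columns track each other almost exactly, as expected for a roughly balanced binary task. The card does not say how the metrics were computed; a plausible `compute_metrics` callback for the `Trainer` is sketched below, with scikit-learn and macro averaging as assumptions.

```python
# Hedged sketch of the metric computation (library and averaging are guesses,
# not stated in the card).
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "f1": f1_score(labels, preds, average="macro"),  # averaging mode assumed
        "accuracy": accuracy_score(labels, preds),
    }
```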
### Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2