GUE_prom_prom_300_all-seqsight_4096_512_27M-L32_f

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_4096_512_27M on the mahdibaghbanzadeh/GUE_prom_prom_300_all dataset. It achieves the following results on the evaluation set:

Loss: 0.2070
F1 Score: 0.9236
Accuracy: 0.9236

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
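
The linear schedule listed above can be illustrated with a minimal sketch. It assumes no warmup steps (none are listed in this card) and a decay to zero at the final step, which is how a linear schedule is commonly defined; the function name linear_lr and its warmup_steps parameter are hypothetical and not taken from the training code.

```python
def linear_lr(step, base_lr=0.0005, total_steps=10000, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay to zero.

    base_lr and total_steps come from the hyperparameters listed in this card;
    warmup_steps=0 is an assumption, since the card does not list a warmup.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup (unused when warmup_steps=0).
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at a few points in training.
for step in (0, 2500, 5000, 10000):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate is halved by step 5000 and reaches zero at step 10000, so the later rows of the training results are produced with progressively smaller updates.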

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.3111	0.54	200	0.2260	0.9121	0.9122
0.2266	1.08	400	0.2086	0.9194	0.9194
0.2153	1.62	600	0.2003	0.9220	0.9220
0.202	2.16	800	0.1943	0.9234	0.9235
0.1989	2.7	1000	0.1850	0.9277	0.9277
0.1927	3.24	1200	0.1920	0.9238	0.9238
0.1883	3.78	1400	0.1792	0.9299	0.9299
0.1866	4.32	1600	0.1842	0.9287	0.9287
0.1778	4.86	1800	0.1843	0.9287	0.9287
0.1729	5.41	2000	0.1870	0.9282	0.9282
0.1718	5.95	2200	0.1780	0.9318	0.9318
0.1692	6.49	2400	0.1733	0.9321	0.9321
0.1674	7.03	2600	0.1780	0.9331	0.9331
0.1588	7.57	2800	0.1773	0.9323	0.9323
0.1627	8.11	3000	0.1867	0.9260	0.9260
0.1571	8.65	3200	0.1735	0.9336	0.9336
0.1501	9.19	3400	0.1852	0.9299	0.9299
0.1521	9.73	3600	0.1736	0.9316	0.9316
0.1544	10.27	3800	0.1776	0.9317	0.9318
0.1517	10.81	4000	0.1773	0.9299	0.9299
0.1442	11.35	4200	0.1826	0.9272	0.9272
0.1449	11.89	4400	0.1754	0.9319	0.9319
0.1438	12.43	4600	0.1752	0.9323	0.9323
0.1383	12.97	4800	0.1709	0.9345	0.9345
0.1361	13.51	5000	0.1925	0.9280	0.9280
0.1364	14.05	5200	0.1788	0.9302	0.9302
0.1295	14.59	5400	0.1764	0.9351	0.9351
0.1317	15.14	5600	0.1761	0.9353	0.9353
0.1278	15.68	5800	0.1838	0.9311	0.9311
0.1305	16.22	6000	0.1764	0.9356	0.9356
0.1266	16.76	6200	0.1755	0.9334	0.9334
0.1262	17.3	6400	0.1762	0.9339	0.9340
0.1265	17.84	6600	0.1717	0.9353	0.9353
0.1197	18.38	6800	0.1792	0.9345	0.9345
0.1227	18.92	7000	0.1753	0.9350	0.9350
0.1196	19.46	7200	0.1785	0.9353	0.9353
0.1157	20.0	7400	0.1808	0.9338	0.9338
0.1201	20.54	7600	0.1810	0.9350	0.9350
0.1175	21.08	7800	0.1755	0.9360	0.9360
0.1099	21.62	8000	0.1809	0.9360	0.9360
0.1137	22.16	8200	0.1809	0.9350	0.9350
0.1116	22.7	8400	0.1790	0.9348	0.9348
0.1111	23.24	8600	0.1809	0.9356	0.9356
0.1122	23.78	8800	0.1831	0.9361	0.9361
0.1142	24.32	9000	0.1820	0.9336	0.9336
0.1078	24.86	9200	0.1822	0.9350	0.9350
0.1091	25.41	9400	0.1845	0.9341	0.9341
0.1086	25.95	9600	0.1838	0.9334	0.9334
0.1097	26.49	9800	0.1827	0.9343	0.9343
0.1059	27.03	10000	0.1825	0.9350	0.9350
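
The Epoch and Step columns above are enough to estimate the number of optimizer steps per epoch and, with the listed train_batch_size, to bound the training set size. The sketch below is only an estimate: it assumes a single device and no gradient accumulation (neither is listed in this card), and the helper name epoch_at is hypothetical.

```python
TRAIN_BATCH_SIZE = 128  # from the hyperparameters in this card

# The table reports epoch 20.0 at step 7400, which implies the steps per epoch.
steps_per_epoch = round(7400 / 20.0)


def epoch_at(step):
    """Fractional epoch reached at a given optimizer step, rounded as in the table."""
    return round(step / steps_per_epoch, 2)


# Bounds on the number of training examples, assuming one batch per step
# and a final batch in each epoch that may be only partially filled.
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1

print(steps_per_epoch)             # steps per epoch implied by the table
print(epoch_at(200), epoch_at(10000))
print(min_examples, max_examples)  # estimated range for the training set size
```

With these assumptions the 10000 training steps correspond to roughly 27 passes over the training data, which agrees with the last row of the table.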

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
