# GUE_prom_prom_300_all-seqsight_4096_512_27M-L1_f
This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_4096_512_27M on the mahdibaghbanzadeh/GUE_prom_prom_300_all dataset. It achieves the following results on the evaluation set:
- Loss: 0.2119
- F1 Score: 0.9138
- Accuracy: 0.9139
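Since the card does not yet include usage instructions, here is a minimal inference sketch. It assumes the adapter weights are published as `mahdibaghbanzadeh/GUE_prom_prom_300_all-seqsight_4096_512_27M-L1_f`, that the task is binary promoter classification (`num_labels=2`), and that the base checkpoint may ship custom modeling code (hence `trust_remote_code=True`); adjust identifiers to your setup.

```python
# Minimal sketch: load the base model, attach the PEFT adapter, classify one sequence.
# Repo paths, num_labels, and trust_remote_code are assumptions, not confirmed by this card.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel

base_id = "mahdibaghbanzadeh/seqsight_4096_512_27M"
adapter_id = "mahdibaghbanzadeh/GUE_prom_prom_300_all-seqsight_4096_512_27M-L1_f"  # assumed adapter repo

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
model = AutoModelForSequenceClassification.from_pretrained(
    base_id, num_labels=2, trust_remote_code=True
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# Score a DNA sequence as promoter / non-promoter.
sequence = "ATGCGTACGTTAGC" * 10
inputs = tokenizer(sequence, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs)
```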
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.0005
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
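A minimal sketch of how these values map onto `transformers.TrainingArguments`; the 200-step evaluation and logging cadence is inferred from the results table below, and the dataset loading and LoRA/PEFT adapter setup are omitted since they are not recorded in this card.

```python
# Configuration sketch only; reproduces the listed hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="GUE_prom_prom_300_all-seqsight_4096_512_27M-L1_f",
    learning_rate=5e-4,               # 0.0005
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    max_steps=10_000,                 # training_steps
    lr_scheduler_type="linear",
    evaluation_strategy="steps",
    eval_steps=200,                   # inferred from the results table
    logging_steps=200,
    # The Trainer's default AdamW optimizer already uses betas=(0.9, 0.999)
    # and epsilon=1e-08, matching the values listed above.
)
```

Passing these arguments to a `Trainer` together with the PEFT-wrapped base model and the GUE_prom_prom_300_all splits would complete the training loop.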
### Training results
Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
---|---|---|---|---|---|
0.3648 | 0.54 | 200 | 0.2612 | 0.8949 | 0.8949 |
0.267 | 1.08 | 400 | 0.2421 | 0.9046 | 0.9046 |
0.2578 | 1.62 | 600 | 0.2327 | 0.9095 | 0.9095 |
0.241 | 2.16 | 800 | 0.2284 | 0.9121 | 0.9122 |
0.2369 | 2.7 | 1000 | 0.2228 | 0.9122 | 0.9122 |
0.2325 | 3.24 | 1200 | 0.2205 | 0.9150 | 0.9150 |
0.2298 | 3.78 | 1400 | 0.2147 | 0.9159 | 0.9159 |
0.2268 | 4.32 | 1600 | 0.2126 | 0.9162 | 0.9162 |
0.2181 | 4.86 | 1800 | 0.2131 | 0.9187 | 0.9187 |
0.2168 | 5.41 | 2000 | 0.2078 | 0.9204 | 0.9204 |
0.2148 | 5.95 | 2200 | 0.2081 | 0.9197 | 0.9198 |
0.2126 | 6.49 | 2400 | 0.2026 | 0.9233 | 0.9233 |
0.2109 | 7.03 | 2600 | 0.2017 | 0.9225 | 0.9225 |
0.2055 | 7.57 | 2800 | 0.2005 | 0.9231 | 0.9231 |
0.2081 | 8.11 | 3000 | 0.1986 | 0.9250 | 0.925 |
0.2072 | 8.65 | 3200 | 0.1968 | 0.9235 | 0.9235 |
0.1997 | 9.19 | 3400 | 0.1984 | 0.9238 | 0.9238 |
0.2 | 9.73 | 3600 | 0.1942 | 0.9255 | 0.9255 |
0.2062 | 10.27 | 3800 | 0.1926 | 0.9257 | 0.9257 |
0.2019 | 10.81 | 4000 | 0.1918 | 0.9247 | 0.9247 |
0.1989 | 11.35 | 4200 | 0.1949 | 0.9260 | 0.9260 |
0.1976 | 11.89 | 4400 | 0.1921 | 0.9252 | 0.9252 |
0.1981 | 12.43 | 4600 | 0.1902 | 0.9265 | 0.9265 |
0.1984 | 12.97 | 4800 | 0.1902 | 0.9250 | 0.925 |
0.1951 | 13.51 | 5000 | 0.1914 | 0.9260 | 0.9260 |
0.1977 | 14.05 | 5200 | 0.1885 | 0.9263 | 0.9264 |
0.1909 | 14.59 | 5400 | 0.1909 | 0.9268 | 0.9269 |
0.1932 | 15.14 | 5600 | 0.1888 | 0.9268 | 0.9269 |
0.1894 | 15.68 | 5800 | 0.1894 | 0.9245 | 0.9245 |
0.1935 | 16.22 | 6000 | 0.1893 | 0.9270 | 0.9270 |
0.1894 | 16.76 | 6200 | 0.1879 | 0.9272 | 0.9272 |
0.1914 | 17.3 | 6400 | 0.1878 | 0.9270 | 0.9270 |
0.1912 | 17.84 | 6600 | 0.1871 | 0.9257 | 0.9257 |
0.1875 | 18.38 | 6800 | 0.1873 | 0.9260 | 0.9260 |
0.1917 | 18.92 | 7000 | 0.1868 | 0.9279 | 0.9279 |
0.19 | 19.46 | 7200 | 0.1869 | 0.9260 | 0.9260 |
0.1865 | 20.0 | 7400 | 0.1863 | 0.9267 | 0.9267 |
0.1909 | 20.54 | 7600 | 0.1853 | 0.9274 | 0.9274 |
0.1864 | 21.08 | 7800 | 0.1853 | 0.9275 | 0.9275 |
0.1875 | 21.62 | 8000 | 0.1854 | 0.9265 | 0.9265 |
0.1866 | 22.16 | 8200 | 0.1852 | 0.9277 | 0.9277 |
0.1836 | 22.7 | 8400 | 0.1856 | 0.9277 | 0.9277 |
0.1888 | 23.24 | 8600 | 0.1851 | 0.9275 | 0.9275 |
0.1847 | 23.78 | 8800 | 0.1850 | 0.9269 | 0.9269 |
0.1903 | 24.32 | 9000 | 0.1850 | 0.9279 | 0.9279 |
0.1844 | 24.86 | 9200 | 0.1849 | 0.9274 | 0.9274 |
0.1842 | 25.41 | 9400 | 0.1852 | 0.9280 | 0.9280 |
0.1867 | 25.95 | 9600 | 0.1850 | 0.9282 | 0.9282 |
0.1848 | 26.49 | 9800 | 0.1848 | 0.9277 | 0.9277 |
0.1847 | 27.03 | 10000 | 0.1848 | 0.9279 | 0.9279 |
### Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2