# GUE_prom_prom_300_all-seqsight_32768_512_43M-L1_f
This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_32768_512_43M on the mahdibaghbanzadeh/GUE_prom_prom_300_all dataset. It achieves the following results on the evaluation set:
- Loss: 0.2119
- F1 Score: 0.9145
- Accuracy: 0.9145
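For orientation, a minimal loading sketch with `transformers` and `peft` follows. It is not taken from the card: the binary label count is inferred from the promoter-detection dataset name (and the matching F1/accuracy values), and the repository id and any `trust_remote_code` requirement are assumptions.

```python
# Hedged sketch: load the base checkpoint and apply this repo's PEFT adapter for inference.
import torch
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

base_id = "mahdibaghbanzadeh/seqsight_32768_512_43M"  # base model named in the card
adapter_id = "mahdibaghbanzadeh/GUE_prom_prom_300_all-seqsight_32768_512_43M-L1_f"  # assumed repo id for this adapter

# num_labels=2 and trust_remote_code=True are assumptions, not stated in the card.
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
model = AutoModelForSequenceClassification.from_pretrained(
    base_id, num_labels=2, trust_remote_code=True
)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# Score one 300-bp placeholder sequence; real inputs come from GUE_prom_prom_300_all.
inputs = tokenizer("ACGT" * 75, return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs)
```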
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
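The training script itself is not part of this card; the sketch below only shows how the listed hyperparameters could map onto `transformers.TrainingArguments`. The output directory, the per-device interpretation of the batch sizes, and the 200-step evaluation/logging cadence (read off the results table below) are assumptions.

```python
# Hedged mapping of the listed hyperparameters onto transformers.TrainingArguments
# (argument names valid for Transformers 4.38.x, the version listed under "Framework versions").
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="GUE_prom_prom_300_all-seqsight_32768_512_43M-L1_f",  # assumed output path
    learning_rate=5e-4,
    per_device_train_batch_size=128,   # assumes a single device, so 128 matches train_batch_size
    per_device_eval_batch_size=128,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    max_steps=10_000,
    evaluation_strategy="steps",       # the results table reports metrics every 200 steps
    eval_steps=200,
    logging_steps=200,
)
```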
### Training results
Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
---|---|---|---|---|---|
0.4346 | 0.54 | 200 | 0.2868 | 0.8895 | 0.8895 |
0.2911 | 1.08 | 400 | 0.2578 | 0.8990 | 0.8990 |
0.2714 | 1.62 | 600 | 0.2389 | 0.9039 | 0.9039 |
0.2514 | 2.16 | 800 | 0.2377 | 0.9043 | 0.9044 |
0.2477 | 2.7 | 1000 | 0.2262 | 0.9061 | 0.9061 |
0.2379 | 3.24 | 1200 | 0.2297 | 0.9080 | 0.9081 |
0.2416 | 3.78 | 1400 | 0.2212 | 0.9102 | 0.9103 |
0.2327 | 4.32 | 1600 | 0.2150 | 0.9111 | 0.9111 |
0.2277 | 4.86 | 1800 | 0.2154 | 0.9120 | 0.9120 |
0.224 | 5.41 | 2000 | 0.2112 | 0.9142 | 0.9142 |
0.2231 | 5.95 | 2200 | 0.2120 | 0.9155 | 0.9155 |
0.2227 | 6.49 | 2400 | 0.2081 | 0.9155 | 0.9155 |
0.2201 | 7.03 | 2600 | 0.2055 | 0.9164 | 0.9164 |
0.2153 | 7.57 | 2800 | 0.2038 | 0.9177 | 0.9177 |
0.2176 | 8.11 | 3000 | 0.2018 | 0.9194 | 0.9194 |
0.2154 | 8.65 | 3200 | 0.2013 | 0.9193 | 0.9193 |
0.2099 | 9.19 | 3400 | 0.1997 | 0.9189 | 0.9189 |
0.2076 | 9.73 | 3600 | 0.1996 | 0.9187 | 0.9187 |
0.2161 | 10.27 | 3800 | 0.1973 | 0.9206 | 0.9206 |
0.2091 | 10.81 | 4000 | 0.1972 | 0.9206 | 0.9206 |
0.2112 | 11.35 | 4200 | 0.2030 | 0.9183 | 0.9184 |
0.2085 | 11.89 | 4400 | 0.1967 | 0.9208 | 0.9208 |
0.2041 | 12.43 | 4600 | 0.1979 | 0.9212 | 0.9213 |
0.2089 | 12.97 | 4800 | 0.1950 | 0.9211 | 0.9211 |
0.2047 | 13.51 | 5000 | 0.1969 | 0.9208 | 0.9208 |
0.2065 | 14.05 | 5200 | 0.1946 | 0.9223 | 0.9223 |
0.2033 | 14.59 | 5400 | 0.1977 | 0.9209 | 0.9209 |
0.2021 | 15.14 | 5600 | 0.1989 | 0.9212 | 0.9213 |
0.2004 | 15.68 | 5800 | 0.1977 | 0.9218 | 0.9218 |
0.2041 | 16.22 | 6000 | 0.2004 | 0.9197 | 0.9198 |
0.2004 | 16.76 | 6200 | 0.1956 | 0.9219 | 0.9220 |
0.2002 | 17.3 | 6400 | 0.1943 | 0.9198 | 0.9198 |
0.2044 | 17.84 | 6600 | 0.1946 | 0.9206 | 0.9206 |
0.1962 | 18.38 | 6800 | 0.1966 | 0.9221 | 0.9221 |
0.2041 | 18.92 | 7000 | 0.1957 | 0.9219 | 0.9220 |
0.201 | 19.46 | 7200 | 0.1931 | 0.9235 | 0.9235 |
0.1972 | 20.0 | 7400 | 0.1928 | 0.9223 | 0.9223 |
0.202 | 20.54 | 7600 | 0.1928 | 0.9240 | 0.9240 |
0.2 | 21.08 | 7800 | 0.1928 | 0.9236 | 0.9236 |
0.1977 | 21.62 | 8000 | 0.1944 | 0.9233 | 0.9233 |
0.198 | 22.16 | 8200 | 0.1929 | 0.9240 | 0.9240 |
0.1908 | 22.7 | 8400 | 0.1942 | 0.9241 | 0.9242 |
0.202 | 23.24 | 8600 | 0.1933 | 0.9231 | 0.9231 |
0.1959 | 23.78 | 8800 | 0.1932 | 0.9231 | 0.9231 |
0.2012 | 24.32 | 9000 | 0.1924 | 0.9235 | 0.9235 |
0.1952 | 24.86 | 9200 | 0.1923 | 0.9235 | 0.9235 |
0.195 | 25.41 | 9400 | 0.1928 | 0.9238 | 0.9238 |
0.1939 | 25.95 | 9600 | 0.1925 | 0.9231 | 0.9231 |
0.1969 | 26.49 | 9800 | 0.1940 | 0.9233 | 0.9233 |
0.1955 | 27.03 | 10000 | 0.1931 | 0.9233 | 0.9233 |
### Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2
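One hedged way to reproduce this environment and confirm the installed versions (the pip commands and the CUDA wheel index are illustrative, inferred from the `+cu121` build tag):

```python
# Pin the runtime to the versions listed above (commands are illustrative):
#   pip install peft==0.9.0 transformers==4.38.2 datasets==2.17.1 tokenizers==0.15.2
#   pip install torch==2.2.0 --index-url https://download.pytorch.org/whl/cu121
import datasets, peft, tokenizers, torch, transformers

# Print installed versions to confirm they match the card.
for name, mod in [("peft", peft), ("transformers", transformers),
                  ("torch", torch), ("datasets", datasets), ("tokenizers", tokenizers)]:
    print(f"{name}: {mod.__version__}")
```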