# GUE_prom_prom_300_all-seqsight_16384_512_34M-L1_f
This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_16384_512_34M on the mahdibaghbanzadeh/GUE_prom_prom_300_all dataset. It achieves the following results on the evaluation set:
- Loss: 0.2161
- F1 Score: 0.9122
- Accuracy: 0.9122
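
As a quick start, the PEFT adapter can be loaded on top of the base checkpoint for inference. The sketch below is illustrative rather than an official usage snippet: it assumes the base model exposes a standard sequence-classification head and tokenizer, that the adapter repo id matches this card's title, and that the task is binary promoter classification; depending on the seqsight architecture, `trust_remote_code=True` may also be required.

```python
# Sketch: load the PEFT adapter on top of the base model for inference.
# Repo ids, num_labels, and the example input are assumptions, not confirmed by this card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel

base_id = "mahdibaghbanzadeh/seqsight_16384_512_34M"
adapter_id = "mahdibaghbanzadeh/GUE_prom_prom_300_all-seqsight_16384_512_34M-L1_f"  # hypothetical hub path

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForSequenceClassification.from_pretrained(base_id, num_labels=2)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Illustrative DNA input for a promoter-detection task.
inputs = tokenizer("ACGTACGTACGT", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))
```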
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
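
For reference, these settings map onto a `transformers` `TrainingArguments` configuration roughly as follows. This is a sketch, not the actual training script; the evaluation interval is inferred from the 200-step cadence in the results table below.

```python
# Sketch: the listed hyperparameters expressed as TrainingArguments.
# output_dir and the eval cadence are assumptions, not taken from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",                  # placeholder path
    learning_rate=5e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    lr_scheduler_type="linear",
    max_steps=10_000,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",       # assumption: evaluate every 200 steps, per the table
    eval_steps=200,
)
```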
### Training results
Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
---|---|---|---|---|---|
0.4298 | 0.54 | 200 | 0.3156 | 0.8804 | 0.8806 |
0.3062 | 1.08 | 400 | 0.2651 | 0.8976 | 0.8976 |
0.2825 | 1.62 | 600 | 0.2513 | 0.8980 | 0.8980 |
0.2626 | 2.16 | 800 | 0.2415 | 0.9006 | 0.9007 |
0.2555 | 2.7 | 1000 | 0.2399 | 0.9015 | 0.9015 |
0.2461 | 3.24 | 1200 | 0.2334 | 0.9073 | 0.9073 |
0.247 | 3.78 | 1400 | 0.2271 | 0.9081 | 0.9081 |
0.2428 | 4.32 | 1600 | 0.2244 | 0.9098 | 0.9098 |
0.2331 | 4.86 | 1800 | 0.2285 | 0.9090 | 0.9090 |
0.2364 | 5.41 | 2000 | 0.2229 | 0.9108 | 0.9108 |
0.2315 | 5.95 | 2200 | 0.2170 | 0.9128 | 0.9128 |
0.2308 | 6.49 | 2400 | 0.2153 | 0.9128 | 0.9128 |
0.2314 | 7.03 | 2600 | 0.2169 | 0.9113 | 0.9113 |
0.2254 | 7.57 | 2800 | 0.2162 | 0.9118 | 0.9118 |
0.2245 | 8.11 | 3000 | 0.2194 | 0.9105 | 0.9105 |
0.2262 | 8.65 | 3200 | 0.2221 | 0.9082 | 0.9083 |
0.2168 | 9.19 | 3400 | 0.2145 | 0.9113 | 0.9113 |
0.2161 | 9.73 | 3600 | 0.2171 | 0.9103 | 0.9103 |
0.222 | 10.27 | 3800 | 0.2090 | 0.9123 | 0.9123 |
0.2151 | 10.81 | 4000 | 0.2075 | 0.9132 | 0.9132 |
0.2189 | 11.35 | 4200 | 0.2056 | 0.9130 | 0.9130 |
0.2134 | 11.89 | 4400 | 0.2111 | 0.9142 | 0.9142 |
0.2142 | 12.43 | 4600 | 0.2061 | 0.9130 | 0.9130 |
0.2152 | 12.97 | 4800 | 0.2049 | 0.9130 | 0.9130 |
0.2127 | 13.51 | 5000 | 0.2060 | 0.9130 | 0.9130 |
0.2161 | 14.05 | 5200 | 0.2043 | 0.9139 | 0.9139 |
0.2086 | 14.59 | 5400 | 0.2026 | 0.9132 | 0.9132 |
0.2084 | 15.14 | 5600 | 0.2016 | 0.9135 | 0.9135 |
0.2067 | 15.68 | 5800 | 0.2036 | 0.9132 | 0.9132 |
0.2126 | 16.22 | 6000 | 0.2016 | 0.9132 | 0.9132 |
0.206 | 16.76 | 6200 | 0.2040 | 0.9145 | 0.9145 |
0.207 | 17.3 | 6400 | 0.2054 | 0.9145 | 0.9145 |
0.2105 | 17.84 | 6600 | 0.2028 | 0.9139 | 0.9139 |
0.2019 | 18.38 | 6800 | 0.2037 | 0.9155 | 0.9155 |
0.211 | 18.92 | 7000 | 0.2019 | 0.9164 | 0.9164 |
0.2065 | 19.46 | 7200 | 0.2086 | 0.9164 | 0.9164 |
0.205 | 20.0 | 7400 | 0.2034 | 0.9155 | 0.9155 |
0.2077 | 20.54 | 7600 | 0.2042 | 0.9164 | 0.9164 |
0.2018 | 21.08 | 7800 | 0.2008 | 0.9160 | 0.9160 |
0.2052 | 21.62 | 8000 | 0.2012 | 0.9169 | 0.9169 |
0.2025 | 22.16 | 8200 | 0.2027 | 0.9150 | 0.9150 |
0.1994 | 22.7 | 8400 | 0.2017 | 0.9162 | 0.9162 |
0.205 | 23.24 | 8600 | 0.2006 | 0.9171 | 0.9171 |
0.2002 | 23.78 | 8800 | 0.2010 | 0.9155 | 0.9155 |
0.2055 | 24.32 | 9000 | 0.2049 | 0.9162 | 0.9162 |
0.1998 | 24.86 | 9200 | 0.2002 | 0.9172 | 0.9172 |
0.2026 | 25.41 | 9400 | 0.2016 | 0.9154 | 0.9154 |
0.2016 | 25.95 | 9600 | 0.2027 | 0.9159 | 0.9159 |
0.2014 | 26.49 | 9800 | 0.2010 | 0.9162 | 0.9162 |
0.2011 | 27.03 | 10000 | 0.2012 | 0.9162 | 0.9162 |
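
The F1 and accuracy columns above are nearly identical at every checkpoint, which is consistent with an averaged F1 computed over a fairly balanced binary task. A conventional `compute_metrics` function that would produce columns like these is sketched below; the card does not publish its actual metric code, and the choice of `average="weighted"` is an assumption.

```python
# Sketch: a typical Trainer compute_metrics yielding accuracy and F1.
# The averaging mode is an assumption, not confirmed by this card.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds, average="weighted"),
    }
```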
### Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2