GUE_prom_prom_core_all-seqsight_32768_512_43M-L32_f

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_32768_512_43M on the mahdibaghbanzadeh/GUE_prom_prom_core_all dataset. It achieves the following results on the evaluation set:

Loss: 0.4103
F1 Score: 0.8197
Accuracy: 0.8198

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
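
To make the schedule concrete, the sketch below computes the learning rate implied by a linear scheduler with the peak rate and step count listed above. It assumes zero warmup steps, since no warmup setting is listed in this card, and the name `linear_lr` is made up for illustration. The formula is the common linear-decay-to-zero rule, not code taken from this training run.

```python
PEAK_LR = 0.0005       # learning_rate from the list above
TOTAL_STEPS = 10000    # training_steps from the list above
WARMUP_STEPS = 0       # assumption: no warmup is listed in the card


def linear_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 2500, 5000, 7500, 10000):
        print(step, linear_lr(step))
```

Under these assumptions the rate falls to half its peak at step 5000 and reaches zero at step 10000.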

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.5026	0.54	200	0.4479	0.7875	0.7875
0.449	1.08	400	0.4580	0.7867	0.7877
0.4297	1.62	600	0.4411	0.7984	0.7986
0.426	2.16	800	0.4462	0.7910	0.7917
0.4232	2.7	1000	0.4405	0.7927	0.7936
0.4197	3.24	1200	0.4318	0.7966	0.7968
0.4174	3.78	1400	0.4356	0.7940	0.7949
0.4093	4.32	1600	0.4287	0.8042	0.8044
0.4096	4.86	1800	0.4404	0.7958	0.7968
0.4051	5.41	2000	0.4395	0.8003	0.8008
0.4044	5.95	2200	0.4295	0.8078	0.8078
0.4058	6.49	2400	0.4268	0.8018	0.8020
0.3957	7.03	2600	0.4296	0.8042	0.8046
0.3973	7.57	2800	0.4234	0.8103	0.8103
0.391	8.11	3000	0.4288	0.8009	0.8014
0.388	8.65	3200	0.4257	0.8052	0.8056
0.3915	9.19	3400	0.4285	0.8118	0.8118
0.3847	9.73	3600	0.4270	0.8072	0.8076
0.3847	10.27	3800	0.4315	0.8075	0.8078
0.3808	10.81	4000	0.4313	0.8074	0.8074
0.3807	11.35	4200	0.4233	0.8109	0.8110
0.3766	11.89	4400	0.4281	0.8074	0.8079
0.3747	12.43	4600	0.4246	0.8123	0.8123
0.3714	12.97	4800	0.4189	0.8113	0.8113
0.3704	13.51	5000	0.4359	0.7986	0.7997
0.3667	14.05	5200	0.4249	0.8138	0.8139
0.3629	14.59	5400	0.4267	0.8084	0.8088
0.3669	15.14	5600	0.4253	0.8127	0.8127
0.3618	15.68	5800	0.4347	0.8073	0.8078
0.3594	16.22	6000	0.4221	0.8115	0.8118
0.3635	16.76	6200	0.4173	0.8116	0.8120
0.3563	17.3	6400	0.4254	0.8115	0.8118
0.3603	17.84	6600	0.4281	0.8106	0.8106
0.3543	18.38	6800	0.4375	0.8052	0.8063
0.3544	18.92	7000	0.4178	0.8130	0.8133
0.3453	19.46	7200	0.4283	0.8138	0.8142
0.3564	20.0	7400	0.4204	0.8143	0.8145
0.3529	20.54	7600	0.4193	0.8119	0.8122
0.3467	21.08	7800	0.4191	0.8180	0.8181
0.3499	21.62	8000	0.4145	0.8144	0.8145
0.3477	22.16	8200	0.4239	0.8143	0.8145
0.3516	22.7	8400	0.4229	0.8089	0.8095
0.3441	23.24	8600	0.4179	0.8138	0.8140
0.3449	23.78	8800	0.4209	0.8130	0.8133
0.3392	24.32	9000	0.4206	0.8167	0.8169
0.3438	24.86	9200	0.4191	0.8147	0.8149
0.3483	25.41	9400	0.4207	0.8132	0.8133
0.3371	25.95	9600	0.4216	0.8152	0.8154
0.3425	26.49	9800	0.4232	0.8138	0.8140
0.3381	27.03	10000	0.4236	0.8148	0.8150
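
The Epoch and Step columns together give a rough sense of the training set size. The sketch below does that arithmetic, assuming a single device and no gradient accumulation (neither setting is listed in this card), so the result is an estimate and not a reported figure.

```python
TRAIN_BATCH_SIZE = 128   # train_batch_size from the hyperparameters
FINAL_STEP = 10000       # step logged in the last table row
FINAL_EPOCH = 27.03      # epoch logged in the last table row

# Optimizer steps needed to pass over the training set once.
steps_per_epoch = FINAL_STEP / FINAL_EPOCH

# Slight overestimate: the last batch of an epoch may be partial.
approx_train_examples = round(steps_per_epoch) * TRAIN_BATCH_SIZE

print(f"steps per epoch ~ {steps_per_epoch:.1f}")
print(f"training examples ~ {approx_train_examples}")
```

This works out to roughly 370 steps per epoch, which is consistent with the first row of the table (200 steps at epoch 0.54).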

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
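
When reproducing these results, it can help to check that the local environment matches the versions listed above. The following is a minimal sketch using only the standard library. The helper name `find_mismatches` is hypothetical, and the comparison ignores local build tags such as `+cu121`.

```python
from importlib import metadata

# Versions listed above; "torch" is the PyPI distribution name for Pytorch.
EXPECTED = {
    "peft": "0.9.0",
    "transformers": "4.38.2",
    "torch": "2.2.0",
    "datasets": "2.17.1",
    "tokenizers": "0.15.2",
}


def find_mismatches(expected=EXPECTED):
    """Return {package: installed_version_or_None} for packages that differ."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            # Drop a local build tag such as "+cu121" before comparing.
            installed = metadata.version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, installed in find_mismatches().items():
        print(f"{name}: expected {EXPECTED[name]}, found {installed}")
```

An empty result means every listed package is installed at the listed version.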
