GUE_prom_prom_core_all-seqsight_4096_512_27M-L1_f

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_4096_512_27M on the mahdibaghbanzadeh/GUE_prom_prom_core_all dataset. It achieves the following results on the evaluation set:

Loss: 0.4142
F1 Score: 0.8106
Accuracy: 0.8106

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
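
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given step. It assumes zero warmup steps, since the card lists no warmup setting, and that the rate decays to zero at the last step. The function name `linear_lr` and its parameters are hypothetical, not part of the training code.

```python
def linear_lr(step: int,
              base_lr: float = 5e-4,      # learning_rate from the card
              total_steps: int = 10_000,  # training_steps from the card
              warmup_steps: int = 0) -> float:  # ASSUMPTION: not stated in the card
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 2_500, 5_000, 7_500, 10_000):
        print(f"step {step:>6}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at the configured 0.0005 and is halved by step 5000. If the actual run used warmup, the early part of the curve would differ.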

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.526	0.54	200	0.4690	0.7800	0.7801
0.4642	1.08	400	0.4573	0.7915	0.7916
0.4496	1.62	600	0.4511	0.7932	0.7934
0.4429	2.16	800	0.4471	0.7951	0.7951
0.4402	2.7	1000	0.4441	0.7963	0.7963
0.4391	3.24	1200	0.4393	0.8002	0.8002
0.4343	3.78	1400	0.4412	0.7965	0.7966
0.4241	4.32	1600	0.4400	0.8011	0.8012
0.4291	4.86	1800	0.4398	0.7979	0.7980
0.4276	5.41	2000	0.4354	0.7978	0.7978
0.424	5.95	2200	0.4369	0.7990	0.7990
0.4281	6.49	2400	0.4354	0.7985	0.7985
0.4189	7.03	2600	0.4380	0.7961	0.7963
0.4221	7.57	2800	0.4347	0.7988	0.7988
0.4136	8.11	3000	0.4358	0.8008	0.8008
0.4154	8.65	3200	0.4325	0.7986	0.7986
0.4181	9.19	3400	0.4356	0.7981	0.7981
0.4159	9.73	3600	0.4349	0.8009	0.8012
0.4191	10.27	3800	0.4318	0.8023	0.8024
0.4132	10.81	4000	0.4376	0.7992	0.7993
0.4148	11.35	4200	0.4317	0.8012	0.8012
0.4124	11.89	4400	0.4291	0.8024	0.8025
0.4146	12.43	4600	0.4318	0.8000	0.8002
0.4097	12.97	4800	0.4291	0.8022	0.8022
0.4106	13.51	5000	0.4318	0.8011	0.8014
0.4095	14.05	5200	0.4289	0.8024	0.8024
0.4087	14.59	5400	0.4328	0.8021	0.8022
0.4117	15.14	5600	0.4330	0.7998	0.8000
0.4105	15.68	5800	0.4303	0.8014	0.8015
0.405	16.22	6000	0.4285	0.8025	0.8025
0.4105	16.76	6200	0.4261	0.8032	0.8032
0.4131	17.3	6400	0.4255	0.8049	0.8049
0.4056	17.84	6600	0.4276	0.8046	0.8046
0.4051	18.38	6800	0.4289	0.8036	0.8037
0.4058	18.92	7000	0.4252	0.8046	0.8046
0.4007	19.46	7200	0.4286	0.8044	0.8044
0.4118	20.0	7400	0.4276	0.8034	0.8034
0.405	20.54	7600	0.4270	0.8057	0.8057
0.4052	21.08	7800	0.4273	0.8049	0.8049
0.405	21.62	8000	0.4278	0.8035	0.8035
0.4043	22.16	8200	0.4247	0.8056	0.8056
0.4099	22.7	8400	0.4241	0.8049	0.8049
0.4027	23.24	8600	0.4262	0.8035	0.8035
0.4025	23.78	8800	0.4265	0.8042	0.8042
0.4015	24.32	9000	0.4264	0.8041	0.8041
0.4043	24.86	9200	0.4259	0.8039	0.8039
0.4081	25.41	9400	0.4255	0.8056	0.8056
0.3981	25.95	9600	0.4261	0.8054	0.8054
0.4064	26.49	9800	0.4258	0.8054	0.8054
0.4008	27.03	10000	0.4259	0.8051	0.8051
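
The Epoch and Step columns above can be combined with the batch size to estimate how many optimizer steps make up one epoch, and from that the approximate size of the training split. The sketch below is a back-of-the-envelope calculation, not a figure reported by the card. It assumes a single device and no gradient accumulation (neither is stated), and the epoch values are rounded to two decimals in the table, so the result is approximate. The helper names are hypothetical.

```python
TRAIN_BATCH_SIZE = 128  # train_batch_size from the hyperparameters

# (epoch, step) pair copied from the last row of the results table.
FINAL_EPOCH, FINAL_STEP = 27.03, 10_000


def steps_per_epoch(epoch: float, step: int) -> float:
    """Estimate optimizer steps per epoch from one logged (epoch, step) pair."""
    return step / epoch


def approx_train_examples(epoch: float, step: int, batch_size: int) -> int:
    """Approximate training-set size.

    ASSUMPTION: one device and no gradient accumulation, so each step
    consumes `batch_size` examples. The last batch of an epoch may be
    partial, which makes this a slight overestimate.
    """
    return round(steps_per_epoch(epoch, step)) * batch_size


if __name__ == "__main__":
    spe = steps_per_epoch(FINAL_EPOCH, FINAL_STEP)
    n = approx_train_examples(FINAL_EPOCH, FINAL_STEP, TRAIN_BATCH_SIZE)
    print(f"steps per epoch ~ {spe:.1f}")
    print(f"training examples ~ {n}")
```

The final row is used because it has the smallest relative rounding error in the Epoch column. Earlier rows give slightly different estimates for the same reason.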

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
