GUE_prom_prom_core_all-seqsight_4096_512_27M-L32_f

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_4096_512_27M on the mahdibaghbanzadeh/GUE_prom_prom_core_all dataset. It achieves the following results on the evaluation set:

Loss: 0.4067
F1 Score: 0.8216
Accuracy: 0.8218
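
The card does not say how the F1 score is averaged. As a minimal sketch, assuming macro averaging over the classes, the two metrics above could be computed as follows. The toy labels and the 1 = promoter, 0 = non-promoter encoding are hypothetical and only illustrate the calculation.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging mode)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 1 = promoter, 0 = non-promoter.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

With balanced classes and symmetric errors, as in this toy example, accuracy and macro F1 coincide, which is consistent with the near-identical values reported above.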

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
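
To make the linear schedule concrete, here is a small sketch of the learning rate at a given step. It assumes zero warmup steps (the card lists none) and decay to zero at the final step; `linear_lr` is an illustrative helper, not part of any library.

```python
def linear_lr(step, base_lr=5e-4, total_steps=10_000, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, midway, and at the end of training.
for step in (0, 5_000, 10_000):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.0005, is halved at step 5000, and reaches zero at step 10000.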

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.4988	0.54	200	0.4541	0.7913	0.7914
0.4434	1.08	400	0.4485	0.7958	0.7961
0.4252	1.62	600	0.4374	0.7988	0.7988
0.4208	2.16	800	0.4346	0.8000	0.8002
0.416	2.7	1000	0.4302	0.8021	0.8022
0.4141	3.24	1200	0.4253	0.8041	0.8041
0.4103	3.78	1400	0.4222	0.8055	0.8056
0.4005	4.32	1600	0.4234	0.8054	0.8054
0.404	4.86	1800	0.4244	0.8075	0.8076
0.4004	5.41	2000	0.4203	0.8028	0.8029
0.3977	5.95	2200	0.4255	0.8061	0.8061
0.3971	6.49	2400	0.4217	0.8037	0.8037
0.3892	7.03	2600	0.4223	0.8081	0.8081
0.3874	7.57	2800	0.4260	0.8061	0.8061
0.3806	8.11	3000	0.4252	0.8070	0.8071
0.3796	8.65	3200	0.4160	0.8090	0.8091
0.382	9.19	3400	0.4239	0.8096	0.8096
0.3781	9.73	3600	0.4217	0.8109	0.8111
0.3795	10.27	3800	0.4218	0.8112	0.8113
0.3724	10.81	4000	0.4285	0.8089	0.8091
0.3686	11.35	4200	0.4226	0.8143	0.8144
0.3692	11.89	4400	0.4139	0.8138	0.8139
0.3656	12.43	4600	0.4227	0.8119	0.8120
0.3648	12.97	4800	0.4143	0.8162	0.8162
0.3598	13.51	5000	0.4204	0.8105	0.8108
0.3591	14.05	5200	0.4187	0.8164	0.8164
0.3541	14.59	5400	0.4187	0.8169	0.8169
0.3585	15.14	5600	0.4201	0.8159	0.8159
0.352	15.68	5800	0.4253	0.8111	0.8113
0.3495	16.22	6000	0.4192	0.8113	0.8115
0.3493	16.76	6200	0.4150	0.8179	0.8179
0.3496	17.3	6400	0.4133	0.8192	0.8193
0.3474	17.84	6600	0.4183	0.8140	0.8140
0.3408	18.38	6800	0.4223	0.8123	0.8127
0.3439	18.92	7000	0.4128	0.8170	0.8171
0.3338	19.46	7200	0.4213	0.8189	0.8189
0.3459	20.0	7400	0.4187	0.8181	0.8181
0.3376	20.54	7600	0.4184	0.8193	0.8194
0.3392	21.08	7800	0.4212	0.8176	0.8176
0.3369	21.62	8000	0.4178	0.8152	0.8152
0.3335	22.16	8200	0.4184	0.8158	0.8159
0.3384	22.7	8400	0.4173	0.8156	0.8157
0.3314	23.24	8600	0.4185	0.8159	0.8159
0.3303	23.78	8800	0.4201	0.8157	0.8157
0.3288	24.32	9000	0.4197	0.8164	0.8164
0.3298	24.86	9200	0.4201	0.8165	0.8166
0.3298	25.41	9400	0.4208	0.8157	0.8157
0.3258	25.95	9600	0.4219	0.8169	0.8169
0.329	26.49	9800	0.4219	0.8162	0.8162
0.3261	27.03	10000	0.4214	0.8176	0.8176
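
The Epoch and Step columns also allow a rough estimate of the training set size, which the card does not state. The sketch below assumes a single device, no gradient accumulation, and that the final partial batch of each epoch is kept; the four rows are copied from the table above.

```python
# (epoch, step) pairs copied from the training results table.
rows = [(0.54, 200), (5.41, 2000), (20.0, 7400), (27.03, 10000)]
batch_size = 128  # train_batch_size from the hyperparameters

# Epoch values are rounded to two decimals, so the last row
# (largest step count) gives the tightest estimate.
epoch, step = rows[-1]
steps_per_epoch = round(step / epoch)

# Sanity check: the estimate must reproduce every logged epoch value.
assert all(round(s / steps_per_epoch, 2) == e for e, s in rows)

# Under the stated assumptions, the number of training examples
# lies between these two bounds (inclusive).
min_examples = (steps_per_epoch - 1) * batch_size + 1
max_examples = steps_per_epoch * batch_size

print(steps_per_epoch, min_examples, max_examples)
```

This gives 370 optimizer steps per epoch, so the 10000 training steps correspond to roughly 27 passes over a training set of about 47 thousand sequences.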

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
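
When reproducing the results, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library; `version_mismatches` is a hypothetical helper, and the PyPI distribution names in `EXPECTED` (for example `torch` for Pytorch) are assumptions about how the packages were installed.

```python
from importlib import metadata

# Versions listed in the card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "peft": "0.9.0",
    "transformers": "4.38.2",
    "torch": "2.2.0+cu121",
    "datasets": "2.17.1",
    "tokenizers": "0.15.2",
}


def version_mismatches(expected=EXPECTED, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in version_mismatches().items():
    print(f"{name}: card lists {wanted}, found {installed or 'not installed'}")
```

An exact string comparison is deliberately strict; a different CUDA build suffix on the Pytorch version, for instance, is reported as a mismatch even though it may be harmless.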
