GUE_prom_prom_core_all-seqsight_4096_512_27M-L8_f
This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_4096_512_27M on the mahdibaghbanzadeh/GUE_prom_prom_core_all dataset. It achieves the following results on the evaluation set:
- Loss: 0.4057
- F1 Score: 0.8135
- Accuracy: 0.8137
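The card does not include a usage snippet, but since this checkpoint is a PEFT adapter on top of a sequence-classification base model, loading it for inference typically looks like the sketch below. The adapter repo id, the binary label count, and the trust_remote_code flag are assumptions, not facts stated in this card.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel

# Assumed repo ids: the base model comes from the card text; the adapter id is
# inferred from the card title and may need its namespace prefix adjusted.
base_id = "mahdibaghbanzadeh/seqsight_4096_512_27M"
adapter_id = "mahdibaghbanzadeh/GUE_prom_prom_core_all-seqsight_4096_512_27M-L8_f"

# trust_remote_code and num_labels=2 (promoter vs. non-promoter) are assumptions.
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = AutoModelForSequenceClassification.from_pretrained(
    base_id, num_labels=2, trust_remote_code=True
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Placeholder DNA sequence; real GUE core-promoter inputs are short fixed-length sequences.
sequence = "ACGT" * 16
inputs = tokenizer(sequence, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))
```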
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training (an equivalent Trainer configuration is sketched after the list):
- learning_rate: 0.0005
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
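For orientation, the list above maps onto a Hugging Face TrainingArguments object roughly as follows. This is a sketch of the configuration only, not the exact training script: dataset preparation and the LoRA/PEFT setup implied by the PEFT framework version are omitted, and the output directory, evaluation cadence, and logging cadence are inferred rather than stated in the card.

```python
from transformers import TrainingArguments

# Rough equivalent of the hyperparameters above. output_dir, evaluation_strategy,
# eval_steps, and logging_steps are assumptions (the results table reports every
# 200 steps); Adam betas=(0.9, 0.999) and epsilon=1e-08 are the library defaults,
# so they need no explicit arguments.
training_args = TrainingArguments(
    output_dir="GUE_prom_prom_core_all-seqsight_4096_512_27M-L8_f",  # assumed
    learning_rate=5e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    lr_scheduler_type="linear",
    max_steps=10_000,
    evaluation_strategy="steps",
    eval_steps=200,
    logging_steps=200,
)
```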
Training results
Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
---|---|---|---|---|---|
0.5111 | 0.54 | 200 | 0.4623 | 0.7827 | 0.7828 |
0.4532 | 1.08 | 400 | 0.4563 | 0.7913 | 0.7916 |
0.4358 | 1.62 | 600 | 0.4413 | 0.7949 | 0.7949 |
0.4289 | 2.16 | 800 | 0.4435 | 0.7948 | 0.7951 |
0.4251 | 2.7 | 1000 | 0.4364 | 0.7980 | 0.7981 |
0.4242 | 3.24 | 1200 | 0.4312 | 0.7990 | 0.7990 |
0.4202 | 3.78 | 1400 | 0.4326 | 0.8022 | 0.8024 |
0.4104 | 4.32 | 1600 | 0.4300 | 0.8044 | 0.8044 |
0.4156 | 4.86 | 1800 | 0.4318 | 0.8021 | 0.8022 |
0.414 | 5.41 | 2000 | 0.4270 | 0.8057 | 0.8057 |
0.4105 | 5.95 | 2200 | 0.4289 | 0.8042 | 0.8042 |
0.4127 | 6.49 | 2400 | 0.4269 | 0.8049 | 0.8049 |
0.4054 | 7.03 | 2600 | 0.4302 | 0.8003 | 0.8005 |
0.4056 | 7.57 | 2800 | 0.4284 | 0.8052 | 0.8052 |
0.3989 | 8.11 | 3000 | 0.4282 | 0.8022 | 0.8024 |
0.3991 | 8.65 | 3200 | 0.4223 | 0.8084 | 0.8084 |
0.4032 | 9.19 | 3400 | 0.4259 | 0.8056 | 0.8056 |
0.3989 | 9.73 | 3600 | 0.4270 | 0.8056 | 0.8059 |
0.4032 | 10.27 | 3800 | 0.4242 | 0.8063 | 0.8064 |
0.3962 | 10.81 | 4000 | 0.4330 | 0.8023 | 0.8025 |
0.3967 | 11.35 | 4200 | 0.4260 | 0.8047 | 0.8047 |
0.3943 | 11.89 | 4400 | 0.4209 | 0.8074 | 0.8076 |
0.395 | 12.43 | 4600 | 0.4256 | 0.8027 | 0.8029 |
0.3926 | 12.97 | 4800 | 0.4204 | 0.8057 | 0.8057 |
0.3915 | 13.51 | 5000 | 0.4242 | 0.8039 | 0.8042 |
0.3892 | 14.05 | 5200 | 0.4224 | 0.8068 | 0.8068 |
0.3872 | 14.59 | 5400 | 0.4224 | 0.8078 | 0.8078 |
0.3911 | 15.14 | 5600 | 0.4237 | 0.8055 | 0.8056 |
0.388 | 15.68 | 5800 | 0.4240 | 0.8068 | 0.8071 |
0.3837 | 16.22 | 6000 | 0.4212 | 0.8058 | 0.8059 |
0.3872 | 16.76 | 6200 | 0.4185 | 0.8084 | 0.8084 |
0.3894 | 17.3 | 6400 | 0.4171 | 0.8057 | 0.8057 |
0.3832 | 17.84 | 6600 | 0.4202 | 0.8068 | 0.8068 |
0.3817 | 18.38 | 6800 | 0.4240 | 0.8071 | 0.8074 |
0.3824 | 18.92 | 7000 | 0.4159 | 0.8059 | 0.8059 |
0.3768 | 19.46 | 7200 | 0.4198 | 0.8062 | 0.8063 |
0.3883 | 20.0 | 7400 | 0.4204 | 0.8059 | 0.8059 |
0.3796 | 20.54 | 7600 | 0.4196 | 0.8076 | 0.8076 |
0.3825 | 21.08 | 7800 | 0.4205 | 0.8074 | 0.8074 |
0.3811 | 21.62 | 8000 | 0.4194 | 0.8037 | 0.8037 |
0.379 | 22.16 | 8200 | 0.4171 | 0.8077 | 0.8078 |
0.385 | 22.7 | 8400 | 0.4169 | 0.8101 | 0.8101 |
0.3771 | 23.24 | 8600 | 0.4182 | 0.8032 | 0.8032 |
0.3759 | 23.78 | 8800 | 0.4191 | 0.8084 | 0.8084 |
0.3766 | 24.32 | 9000 | 0.4184 | 0.8076 | 0.8076 |
0.3776 | 24.86 | 9200 | 0.4181 | 0.8056 | 0.8056 |
0.3806 | 25.41 | 9400 | 0.4177 | 0.8064 | 0.8064 |
0.3726 | 25.95 | 9600 | 0.4186 | 0.8066 | 0.8066 |
0.3789 | 26.49 | 9800 | 0.4186 | 0.8072 | 0.8073 |
0.3735 | 27.03 | 10000 | 0.4188 | 0.8073 | 0.8073 |
Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2