# GUE_prom_prom_300_all-seqsight_16384_512_22M-L32_all
This model is a fine-tuned version of [mahdibaghbanzadeh/seqsight_16384_512_22M](https://huggingface.co/mahdibaghbanzadeh/seqsight_16384_512_22M) on the [mahdibaghbanzadeh/GUE_prom_prom_300_all](https://huggingface.co/datasets/mahdibaghbanzadeh/GUE_prom_prom_300_all) dataset. It achieves the following results on the evaluation set:
- Loss: 0.4221
- F1 Score: 0.8280
- Accuracy: 0.8280
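Since the adapter weights are published separately from the base model (note the PEFT entry under framework versions), loading the checkpoint for inference could look like the sketch below. The adapter repo id, the binary classification head (`num_labels=2`), and the need for `trust_remote_code` are assumptions, not details confirmed by this card.

```python
# Hedged sketch: load the base model plus this PEFT adapter for inference.
# The adapter repo id, num_labels=2, and trust_remote_code are assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel

BASE = "mahdibaghbanzadeh/seqsight_16384_512_22M"
ADAPTER = "mahdibaghbanzadeh/GUE_prom_prom_300_all-seqsight_16384_512_22M-L32_all"

tokenizer = AutoTokenizer.from_pretrained(BASE, trust_remote_code=True)
base_model = AutoModelForSequenceClassification.from_pretrained(
    BASE, num_labels=2, trust_remote_code=True
)
model = PeftModel.from_pretrained(base_model, ADAPTER)
model.eval()

# The dataset name suggests 300 bp promoter sequences; this input is a placeholder.
dna = "ACGT" * 75
batch = tokenizer(dna, return_tensors="pt")
with torch.no_grad():
    probs = model(**batch).logits.softmax(dim=-1)
print(probs)
```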
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 1536
- eval_batch_size: 1536
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
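For reference, these settings map onto a Transformers `TrainingArguments` object roughly as sketched below. `output_dir` is a placeholder, and the 200-step evaluation cadence is inferred from the results table rather than stated above.

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters as TrainingArguments
# (output_dir is a placeholder; eval_steps=200 is inferred from the table).
training_args = TrainingArguments(
    output_dir="seqsight-prom300-adapter",
    learning_rate=5e-4,
    per_device_train_batch_size=1536,
    per_device_eval_batch_size=1536,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    max_steps=10_000,
    evaluation_strategy="steps",
    eval_steps=200,
)
```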
### Training results
Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
---|---|---|---|---|---|
0.5993 | 6.45 | 200 | 0.5279 | 0.7420 | 0.7424 |
0.511 | 12.9 | 400 | 0.5012 | 0.7558 | 0.7568 |
0.4874 | 19.35 | 600 | 0.4878 | 0.7699 | 0.7703 |
0.4675 | 25.81 | 800 | 0.4767 | 0.7743 | 0.7752 |
0.4472 | 32.26 | 1000 | 0.4560 | 0.7845 | 0.7850 |
0.4209 | 38.71 | 1200 | 0.4419 | 0.7936 | 0.7943 |
0.4041 | 45.16 | 1400 | 0.4286 | 0.8064 | 0.8064 |
0.391 | 51.61 | 1600 | 0.4183 | 0.8108 | 0.8108 |
0.3826 | 58.06 | 1800 | 0.4144 | 0.8116 | 0.8117 |
0.3731 | 64.52 | 2000 | 0.4179 | 0.8139 | 0.8140 |
0.3664 | 70.97 | 2200 | 0.4126 | 0.8133 | 0.8135 |
0.36 | 77.42 | 2400 | 0.4184 | 0.8099 | 0.8103 |
0.3538 | 83.87 | 2600 | 0.4093 | 0.8168 | 0.8169 |
0.3482 | 90.32 | 2800 | 0.4159 | 0.8165 | 0.8166 |
0.3418 | 96.77 | 3000 | 0.4082 | 0.8214 | 0.8215 |
0.3369 | 103.23 | 3200 | 0.4192 | 0.8204 | 0.8206 |
0.3321 | 109.68 | 3400 | 0.4123 | 0.8200 | 0.8203 |
0.3266 | 116.13 | 3600 | 0.4095 | 0.8210 | 0.8211 |
0.3241 | 122.58 | 3800 | 0.4094 | 0.8224 | 0.8225 |
0.3213 | 129.03 | 4000 | 0.4024 | 0.8233 | 0.8235 |
0.3168 | 135.48 | 4200 | 0.4072 | 0.8249 | 0.8250 |
0.3121 | 141.94 | 4400 | 0.4084 | 0.8259 | 0.8260 |
0.3107 | 148.39 | 4600 | 0.4125 | 0.8266 | 0.8267 |
0.3074 | 154.84 | 4800 | 0.4168 | 0.8231 | 0.8233 |
0.3051 | 161.29 | 5000 | 0.4144 | 0.8260 | 0.8262 |
0.3034 | 167.74 | 5200 | 0.4244 | 0.8241 | 0.8243 |
0.2992 | 174.19 | 5400 | 0.4163 | 0.8295 | 0.8296 |
0.2985 | 180.65 | 5600 | 0.4101 | 0.8268 | 0.8269 |
0.2959 | 187.1 | 5800 | 0.4233 | 0.8252 | 0.8253 |
0.2944 | 193.55 | 6000 | 0.4147 | 0.8268 | 0.8269 |
0.2926 | 200.0 | 6200 | 0.4145 | 0.8309 | 0.8309 |
0.2907 | 206.45 | 6400 | 0.4186 | 0.8252 | 0.8253 |
0.2891 | 212.9 | 6600 | 0.4275 | 0.8265 | 0.8267 |
0.288 | 219.35 | 6800 | 0.4174 | 0.8264 | 0.8265 |
0.2861 | 225.81 | 7000 | 0.4149 | 0.8270 | 0.8270 |
0.2833 | 232.26 | 7200 | 0.4089 | 0.8287 | 0.8287 |
0.2842 | 238.71 | 7400 | 0.4158 | 0.8267 | 0.8267 |
0.2828 | 245.16 | 7600 | 0.4135 | 0.8286 | 0.8287 |
0.2819 | 251.61 | 7800 | 0.4157 | 0.8272 | 0.8272 |
0.2797 | 258.06 | 8000 | 0.4160 | 0.8296 | 0.8296 |
0.2785 | 264.52 | 8200 | 0.4180 | 0.8249 | 0.8250 |
0.2785 | 270.97 | 8400 | 0.4247 | 0.8269 | 0.8270 |
0.278 | 277.42 | 8600 | 0.4147 | 0.8271 | 0.8272 |
0.2767 | 283.87 | 8800 | 0.4157 | 0.8261 | 0.8262 |
0.2769 | 290.32 | 9000 | 0.4172 | 0.8249 | 0.8250 |
0.2757 | 296.77 | 9200 | 0.4173 | 0.8258 | 0.8258 |
0.2763 | 303.23 | 9400 | 0.4180 | 0.8259 | 0.8260 |
0.2755 | 309.68 | 9600 | 0.4202 | 0.8269 | 0.8270 |
0.2764 | 316.13 | 9800 | 0.4165 | 0.8258 | 0.8258 |
0.2741 | 322.58 | 10000 | 0.4192 | 0.8258 | 0.8258 |
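The F1 and accuracy columns above could be reproduced with a `compute_metrics` callback along these lines. The F1 averaging mode is an assumption; macro averaging is a plausible guess given how closely F1 tracks accuracy in the table.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    # Sketch of the metric computation behind the table above;
    # the F1 averaging mode ("macro") is an assumption.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "f1": f1_score(labels, preds, average="macro"),
        "accuracy": accuracy_score(labels, preds),
    }
```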
### Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2