GUE_prom_prom_300_tata-seqsight_16384_512_22M-L32_all

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_16384_512_22M on the mahdibaghbanzadeh/GUE_prom_prom_300_tata dataset. It achieves the following results on the evaluation set:

Loss: 0.8354
F1 Score: 0.5971
Accuracy: 0.5971

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 2048
eval_batch_size: 2048
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
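
The linear scheduler listed above can be illustrated with a short sketch. It assumes zero warmup steps (the card lists no warmup setting) and a decay to zero at the final training step, which matches the usual behavior of a linear schedule. The names `linear_lr` and `WARMUP_STEPS` are hypothetical and are not taken from the training script.

```python
BASE_LR = 0.0005      # learning_rate from the card
TOTAL_STEPS = 10000   # training_steps from the card
WARMUP_STEPS = 0      # assumption: no warmup setting is listed in the card


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 2500, 5000, 7500, 10000):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate is halved by step 5000 and reaches zero at step 10000.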

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.594	66.67	200	0.7979	0.6313	0.6313
0.3758	133.33	400	1.0549	0.6299	0.6297
0.267	200.0	600	1.2311	0.6129	0.6134
0.2214	266.67	800	1.3614	0.6036	0.6036
0.1977	333.33	1000	1.3433	0.6031	0.6052
0.184	400.0	1200	1.3919	0.6169	0.6166
0.1721	466.67	1400	1.4290	0.6038	0.6036
0.1647	533.33	1600	1.4838	0.6071	0.6069
0.155	600.0	1800	1.4811	0.6168	0.6166
0.1459	666.67	2000	1.5943	0.6165	0.6166
0.1422	733.33	2200	1.6284	0.6087	0.6101
0.1319	800.0	2400	1.7008	0.6137	0.6134
0.1237	866.67	2600	1.5816	0.6006	0.6003
0.1161	933.33	2800	1.8001	0.6025	0.6036
0.1101	1000.0	3000	1.7079	0.6068	0.6069
0.1036	1066.67	3200	1.8471	0.6071	0.6085
0.097	1133.33	3400	1.7883	0.6006	0.6003
0.093	1200.0	3600	1.9631	0.6131	0.6134
0.0873	1266.67	3800	1.9510	0.6115	0.6117
0.0842	1333.33	4000	1.8361	0.6099	0.6101
0.0803	1400.0	4200	1.9078	0.6080	0.6085
0.076	1466.67	4400	1.9444	0.6227	0.6232
0.0732	1533.33	4600	1.9880	0.6077	0.6085
0.0688	1600.0	4800	2.1511	0.5987	0.6003
0.067	1666.67	5000	2.1142	0.6097	0.6101
0.0651	1733.33	5200	2.1860	0.6090	0.6101
0.0628	1800.0	5400	2.0372	0.6212	0.6215
0.0606	1866.67	5600	2.2769	0.6128	0.6150
0.0588	1933.33	5800	2.1388	0.6094	0.6101
0.0562	2000.0	6000	2.1657	0.6111	0.6117
0.0548	2066.67	6200	2.0734	0.6165	0.6166
0.0539	2133.33	6400	2.0996	0.6127	0.6134
0.051	2200.0	6600	2.1679	0.6130	0.6134
0.0513	2266.67	6800	2.1512	0.6188	0.6199
0.0489	2333.33	7000	2.1352	0.6129	0.6134
0.0471	2400.0	7200	2.3141	0.6175	0.6183
0.0468	2466.67	7400	2.1969	0.6144	0.6150
0.0448	2533.33	7600	2.2664	0.6144	0.6150
0.0445	2600.0	7800	2.2993	0.6124	0.6134
0.0435	2666.67	8000	2.2378	0.6083	0.6085
0.0439	2733.33	8200	2.1876	0.6081	0.6085
0.0417	2800.0	8400	2.2377	0.6115	0.6117
0.0409	2866.67	8600	2.2993	0.6106	0.6117
0.0412	2933.33	8800	2.2438	0.6130	0.6134
0.04	3000.0	9000	2.2970	0.6104	0.6117
0.0404	3066.67	9200	2.3617	0.6174	0.6183
0.0392	3133.33	9400	2.2748	0.6161	0.6166
0.0394	3200.0	9600	2.3875	0.6168	0.6183
0.0382	3266.67	9800	2.3591	0.6156	0.6166
0.0381	3333.33	10000	2.3524	0.6156	0.6166
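
The table can be read programmatically to see how training and validation loss diverge. The sketch below uses a hand-copied excerpt of five rows from the table (steps 200, 400, 600, 5000, and 10000); the helper names are hypothetical. It estimates the number of optimizer steps per epoch, finds the logged row with the lowest validation loss, and computes the gap between validation and training loss.

```python
# Excerpt of the table above:
# (training_loss, epoch, step, validation_loss, f1, accuracy)
ROWS = [
    (0.594, 66.67, 200, 0.7979, 0.6313, 0.6313),
    (0.3758, 133.33, 400, 1.0549, 0.6299, 0.6297),
    (0.267, 200.0, 600, 1.2311, 0.6129, 0.6134),
    (0.067, 1666.67, 5000, 2.1142, 0.6097, 0.6101),
    (0.0381, 3333.33, 10000, 2.3524, 0.6156, 0.6166),
]


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the first logged row."""
    _, epoch, step, *_ = rows[0]
    return round(step / epoch)


def best_by_val_loss(rows):
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def generalization_gap(row):
    """Validation loss minus training loss for one logged row."""
    return row[3] - row[0]


if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch(ROWS))
    print("best step by validation loss:", best_by_val_loss(ROWS)[2])
    for row in ROWS:
        print(row[2], round(generalization_gap(row), 4))
```

In this excerpt, as in the full table, the lowest validation loss and the highest F1 score both occur at step 200, while training loss keeps falling and validation loss keeps rising afterwards. That pattern is consistent with overfitting, so an earlier checkpoint may generalize better than the final one. The estimate of about 3 steps per epoch, combined with a batch size of 2048, would suggest a training set of roughly 4,100 to 6,100 examples, assuming a single device and no gradient accumulation; neither assumption is confirmed by the card.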

Framework versions

PEFT 0.9.0
Transformers 4.38.2
PyTorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
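
When reproducing the results, it may help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library; the function names are hypothetical, and it assumes the PyPI distribution names `peft`, `transformers`, `torch`, `datasets`, and `tokenizers`.

```python
from importlib import metadata

# Versions listed in the card. "torch" is the distribution name for PyTorch.
EXPECTED = {
    "peft": "0.9.0",
    "transformers": "4.38.2",
    "torch": "2.2.0+cu121",
    "datasets": "2.17.1",
    "tokenizers": "0.15.2",
}


def version_report(expected=EXPECTED):
    """Map each package to (expected, installed), with None if not installed."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


def mismatches(report):
    """Keep only the packages whose installed version differs from the card."""
    return {name: pair for name, pair in report.items() if pair[0] != pair[1]}


if __name__ == "__main__":
    for name, (wanted, installed) in mismatches(version_report()).items():
        print(f"{name}: card lists {wanted}, found {installed}")
```

A mismatch is not necessarily a problem. For example, a CPU-only or differently built PyTorch wheel reports a different local version suffix than `+cu121`.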
