# GUE_prom_prom_300_tata-seqsight_4096_512_15M-L32_all
This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_4096_512_15M on the mahdibaghbanzadeh/GUE_prom_prom_300_tata dataset. It achieves the following results on the evaluation set:
- Loss: 1.9825
- F1 Score: 0.6573
- Accuracy: 0.6574
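The F1 score and accuracy above are nearly identical, which is expected when the class distribution is roughly balanced. As a minimal sketch of how these two metrics are computed (the averaging mode used by the evaluation script is not stated in this card; macro averaging over the label set is assumed here):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores, averaged with equal weight."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if (tp + fp) else 0.0
        rec = tp / (tp + fn) if (tp + fn) else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if (prec + rec) else 0.0)
    return sum(f1s) / len(f1s)

# Toy binary example (promoter = 1, non-promoter = 0); not from the dataset.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]
print(accuracy(y_true, y_pred))  # 4 of 6 correct
print(macro_f1(y_true, y_pred))
```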
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 2048
- eval_batch_size: 2048
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
### Training results

| Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
|---|---|---|---|---|---|
0.5881 | 66.67 | 200 | 0.7620 | 0.6185 | 0.6183 |
0.374 | 133.33 | 400 | 0.9147 | 0.6573 | 0.6574 |
0.272 | 200.0 | 600 | 1.0506 | 0.6653 | 0.6656 |
0.2261 | 266.67 | 800 | 1.1062 | 0.6469 | 0.6476 |
0.2015 | 333.33 | 1000 | 1.1267 | 0.6414 | 0.6411 |
0.1875 | 400.0 | 1200 | 1.1849 | 0.6363 | 0.6362 |
0.1758 | 466.67 | 1400 | 1.1932 | 0.6455 | 0.6460 |
0.1634 | 533.33 | 1600 | 1.2416 | 0.6440 | 0.6444 |
0.1555 | 600.0 | 1800 | 1.3574 | 0.6488 | 0.6493 |
0.1475 | 666.67 | 2000 | 1.4170 | 0.6350 | 0.6378 |
0.1374 | 733.33 | 2200 | 1.4038 | 0.6441 | 0.6444 |
0.1286 | 800.0 | 2400 | 1.4878 | 0.6465 | 0.6476 |
0.1229 | 866.67 | 2600 | 1.4307 | 0.6577 | 0.6574 |
0.1162 | 933.33 | 2800 | 1.5280 | 0.6427 | 0.6427 |
0.1091 | 1000.0 | 3000 | 1.4177 | 0.6488 | 0.6493 |
0.1018 | 1066.67 | 3200 | 1.6755 | 0.6524 | 0.6525 |
0.0973 | 1133.33 | 3400 | 1.5230 | 0.6463 | 0.6460 |
0.0917 | 1200.0 | 3600 | 1.5559 | 0.6550 | 0.6558 |
0.0877 | 1266.67 | 3800 | 1.6510 | 0.6602 | 0.6607 |
0.0819 | 1333.33 | 4000 | 1.6203 | 0.6586 | 0.6591 |
0.0777 | 1400.0 | 4200 | 1.6706 | 0.6600 | 0.6607 |
0.0736 | 1466.67 | 4400 | 1.5861 | 0.6652 | 0.6656 |
0.0698 | 1533.33 | 4600 | 1.6971 | 0.6623 | 0.6623 |
0.0671 | 1600.0 | 4800 | 1.7818 | 0.6717 | 0.6721 |
0.0634 | 1666.67 | 5000 | 1.8030 | 0.6590 | 0.6591 |
0.0615 | 1733.33 | 5200 | 1.7842 | 0.6615 | 0.6623 |
0.0587 | 1800.0 | 5400 | 1.7741 | 0.6591 | 0.6607 |
0.0568 | 1866.67 | 5600 | 1.8269 | 0.6577 | 0.6591 |
0.0551 | 1933.33 | 5800 | 1.8929 | 0.6661 | 0.6672 |
0.0531 | 2000.0 | 6000 | 1.9567 | 0.6641 | 0.6639 |
0.0505 | 2066.67 | 6200 | 1.8462 | 0.6526 | 0.6525 |
0.0494 | 2133.33 | 6400 | 1.8927 | 0.6600 | 0.6607 |
0.0473 | 2200.0 | 6600 | 2.0680 | 0.6575 | 0.6574 |
0.046 | 2266.67 | 6800 | 1.8894 | 0.6526 | 0.6525 |
0.0447 | 2333.33 | 7000 | 1.9051 | 0.6543 | 0.6542 |
0.0444 | 2400.0 | 7200 | 2.1094 | 0.6511 | 0.6509 |
0.0423 | 2466.67 | 7400 | 1.9778 | 0.6729 | 0.6737 |
0.0411 | 2533.33 | 7600 | 1.9854 | 0.6618 | 0.6623 |
0.0407 | 2600.0 | 7800 | 1.9483 | 0.6687 | 0.6688 |
0.04 | 2666.67 | 8000 | 1.9649 | 0.6575 | 0.6574 |
0.039 | 2733.33 | 8200 | 1.9644 | 0.6606 | 0.6607 |
0.0388 | 2800.0 | 8400 | 2.0501 | 0.6670 | 0.6672 |
0.0375 | 2866.67 | 8600 | 2.0106 | 0.6622 | 0.6623 |
0.0368 | 2933.33 | 8800 | 2.0446 | 0.6586 | 0.6591 |
0.0363 | 3000.0 | 9000 | 2.0473 | 0.6555 | 0.6558 |
0.0363 | 3066.67 | 9200 | 2.0159 | 0.6602 | 0.6607 |
0.0358 | 3133.33 | 9400 | 2.0621 | 0.6618 | 0.6623 |
0.0355 | 3200.0 | 9600 | 2.0734 | 0.6686 | 0.6688 |
0.0357 | 3266.67 | 9800 | 2.0886 | 0.6639 | 0.6639 |
0.0358 | 3333.33 | 10000 | 2.0690 | 0.6606 | 0.6607 |
### Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2