GUE_tf_3-seqsight_65536_512_47M-L32_all

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_65536_512_47M on the mahdibaghbanzadeh/GUE_tf_3 dataset. It achieves the following results on the evaluation set:

Loss: 0.6970
F1 Score: 0.5785
Accuracy: 0.58

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 2048
eval_batch_size: 2048
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
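
To illustrate what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given step. It assumes zero warmup steps (the card lists no warmup value) and a decay to zero at step 10000. The function name linear_lr and its warmup_steps parameter are hypothetical and are not taken from the training code.

```python
def linear_lr(step: int,
              base_lr: float = 5e-4,
              total_steps: int = 10_000,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    # Assumption: warmup_steps defaults to 0 because the card does not report one.
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 2_000, 5_000, 10_000):
        print(f"step {step:>6}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.0005, is halved to 0.00025 at step 5000, and reaches 0 at step 10000.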

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.6775	14.29	200	0.6530	0.5861	0.599
0.6521	28.57	400	0.6515	0.5891	0.593
0.6332	42.86	600	0.6520	0.5893	0.601
0.6109	57.14	800	0.6521	0.6138	0.618
0.585	71.43	1000	0.6535	0.6204	0.621
0.5689	85.71	1200	0.6656	0.6290	0.629
0.5582	100.0	1400	0.6678	0.6236	0.624
0.5492	114.29	1600	0.6860	0.6127	0.616
0.5434	128.57	1800	0.6744	0.6112	0.612
0.5358	142.86	2000	0.6770	0.6166	0.617
0.528	157.14	2200	0.6876	0.6151	0.615
0.5226	171.43	2400	0.7186	0.6139	0.614
0.5156	185.71	2600	0.7043	0.6091	0.61
0.5071	200.0	2800	0.7230	0.6170	0.62
0.502	214.29	3000	0.7309	0.6030	0.603
0.4934	228.57	3200	0.7531	0.6029	0.604
0.4861	242.86	3400	0.7478	0.6089	0.609
0.4796	257.14	3600	0.7654	0.6181	0.618
0.4725	271.43	3800	0.7692	0.6159	0.616
0.4676	285.71	4000	0.7616	0.6001	0.6
0.4604	300.0	4200	0.7514	0.6021	0.602
0.4534	314.29	4400	0.7611	0.6120	0.612
0.4481	328.57	4600	0.7757	0.6117	0.612
0.4428	342.86	4800	0.7963	0.6021	0.602
0.4388	357.14	5000	0.8140	0.6110	0.611
0.4297	371.43	5200	0.8055	0.6081	0.608
0.4241	385.71	5400	0.8102	0.6159	0.616
0.4198	400.0	5600	0.8355	0.6021	0.602
0.4142	414.29	5800	0.8202	0.6120	0.612
0.4114	428.57	6000	0.8378	0.6069	0.607
0.4076	442.86	6200	0.8493	0.5916	0.593
0.4017	457.14	6400	0.8281	0.6123	0.613
0.3977	471.43	6600	0.8478	0.5999	0.6
0.3934	485.71	6800	0.8371	0.6145	0.615
0.3897	500.0	7000	0.8405	0.6051	0.605
0.3863	514.29	7200	0.8297	0.6081	0.608
0.3829	528.57	7400	0.8615	0.6051	0.605
0.3795	542.86	7600	0.8482	0.6041	0.604
0.3775	557.14	7800	0.8614	0.6101	0.61
0.3733	571.43	8000	0.8678	0.6081	0.608
0.3708	585.71	8200	0.8759	0.6101	0.61
0.3697	600.0	8400	0.8474	0.6140	0.614
0.3678	614.29	8600	0.8764	0.5986	0.599
0.3661	628.57	8800	0.8847	0.6071	0.607
0.363	642.86	9000	0.8804	0.6151	0.615
0.3619	657.14	9200	0.8750	0.6131	0.613
0.3616	671.43	9400	0.8799	0.6101	0.61
0.3588	685.71	9600	0.8777	0.6061	0.606
0.3598	700.0	9800	0.8793	0.6010	0.601
0.3598	714.29	10000	0.8761	0.6041	0.604
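
The table can be read programmatically to pick a checkpoint and to estimate the training set size. The sketch below copies five rows from the table and makes three assumptions that the card does not state: a single device, no gradient accumulation, and a final partial batch that is kept. The variable names are illustrative only.

```python
# Columns: training_loss, epoch, step, validation_loss, f1, accuracy.
# Rows are copied verbatim from the training results table; the rest are omitted for brevity.
ROWS = [
    (0.6775, 14.29, 200, 0.6530, 0.5861, 0.599),
    (0.6521, 28.57, 400, 0.6515, 0.5891, 0.593),
    (0.5689, 85.71, 1200, 0.6656, 0.6290, 0.629),
    (0.4388, 357.14, 5000, 0.8140, 0.6110, 0.611),
    (0.3598, 714.29, 10000, 0.8761, 0.6041, 0.604),
]

# Checkpoint selection by two different criteria.
best_by_loss = min(ROWS, key=lambda r: r[3])
best_by_f1 = max(ROWS, key=lambda r: r[4])

# Optimizer steps per epoch, inferred from the first logged row (200 steps = 14.29 epochs).
steps_per_epoch = round(ROWS[0][2] / ROWS[0][1])

# Bounds on the number of training examples (assumes batch size 2048 on one device,
# no gradient accumulation, last partial batch kept).
batch_size = 2048
min_examples = (steps_per_epoch - 1) * batch_size + 1
max_examples = steps_per_epoch * batch_size

if __name__ == "__main__":
    print("lowest validation loss at step", best_by_loss[2])
    print("highest F1 at step", best_by_f1[2])
    print("steps per epoch:", steps_per_epoch)
    print("training examples between", min_examples, "and", max_examples)
```

In the full table the same two rows win: validation loss is lowest at step 400 (0.6515) and F1 peaks at step 1200 (0.6290). After that, training loss keeps falling to 0.3598 while validation loss rises to 0.8761, the usual signature of overfitting, so an earlier checkpoint may be preferable to the final one.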

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
