GUE_tf_0-seqsight_8192_512_30M-L32_all

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_8192_512_30M on the mahdibaghbanzadeh/GUE_tf_0 dataset. It achieves the following results on the evaluation set:

Loss: 0.6934
F1 Score: 0.7132
Accuracy: 0.715

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 2048
eval_batch_size: 2048
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
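
For reference, the combination of `lr_scheduler_type: linear` and `training_steps: 10000` can be sketched in plain Python as below. This is a minimal sketch, not the training code: it assumes zero warmup steps, since no warmup setting is listed above, and the names `linear_lr`, `BASE_LR`, `TRAINING_STEPS` and `WARMUP_STEPS` are illustrative only.

```python
# Values copied from the hyperparameter list above.
BASE_LR = 0.0005
TRAINING_STEPS = 10_000
# Assumption: no warmup, because the card does not list a warmup setting.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < WARMUP_STEPS:
        # Ramp up linearly from 0 to BASE_LR during warmup.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Decay linearly from BASE_LR to 0 over the remaining steps.
    remaining = max(0, TRAINING_STEPS - step)
    return BASE_LR * remaining / max(1, TRAINING_STEPS - WARMUP_STEPS)


for step in (0, 2500, 5000, 7500, 10_000):
    print(f"step {step:>5}: lr = {linear_lr(step):.6f}")
```

Under the no-warmup assumption, the rate starts at 0.0005, reaches half that value at step 5000, and falls to zero at step 10000.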

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.6225	12.5	200	0.5892	0.6971	0.697
0.5357	25.0	400	0.5881	0.7101	0.71
0.4937	37.5	600	0.5850	0.7244	0.726
0.4603	50.0	800	0.6049	0.7187	0.72
0.4325	62.5	1000	0.6313	0.7047	0.705
0.4105	75.0	1200	0.6312	0.7209	0.721
0.3917	87.5	1400	0.6381	0.7121	0.713
0.3745	100.0	1600	0.6879	0.7190	0.719
0.3607	112.5	1800	0.6741	0.7225	0.724
0.3507	125.0	2000	0.6616	0.7256	0.726
0.3407	137.5	2200	0.6852	0.7266	0.727
0.329	150.0	2400	0.7090	0.7287	0.73
0.3201	162.5	2600	0.6944	0.7197	0.721
0.3093	175.0	2800	0.7109	0.7220	0.722
0.2984	187.5	3000	0.7240	0.7199	0.72
0.292	200.0	3200	0.7457	0.7209	0.721
0.2815	212.5	3400	0.7469	0.7159	0.716
0.2739	225.0	3600	0.7821	0.7110	0.711
0.2661	237.5	3800	0.7747	0.7100	0.71
0.2595	250.0	4000	0.7560	0.7100	0.71
0.2501	262.5	4200	0.7846	0.7109	0.711
0.2449	275.0	4400	0.7904	0.7110	0.711
0.2367	287.5	4600	0.7928	0.7116	0.712
0.2316	300.0	4800	0.8287	0.7093	0.71
0.2255	312.5	5000	0.8437	0.7106	0.711
0.2203	325.0	5200	0.8609	0.7096	0.71
0.2139	337.5	5400	0.8534	0.7067	0.707
0.2089	350.0	5600	0.8720	0.7120	0.712
0.2056	362.5	5800	0.8517	0.7091	0.709
0.1984	375.0	6000	0.8594	0.702	0.702
0.1969	387.5	6200	0.8928	0.7020	0.702
0.1917	400.0	6400	0.8901	0.7114	0.712
0.1882	412.5	6600	0.8833	0.7109	0.711
0.1848	425.0	6800	0.8861	0.6970	0.697
0.1803	437.5	7000	0.9046	0.7029	0.703
0.1772	450.0	7200	0.9143	0.6994	0.7
0.1751	462.5	7400	0.9243	0.6967	0.697
0.1732	475.0	7600	0.9390	0.7069	0.707
0.1699	487.5	7800	0.9518	0.7080	0.708
0.1662	500.0	8000	0.9361	0.7070	0.707
0.1659	512.5	8200	0.9330	0.6999	0.7
0.163	525.0	8400	0.9480	0.6989	0.699
0.1613	537.5	8600	0.9420	0.7050	0.705
0.1611	550.0	8800	0.9542	0.7070	0.707
0.1582	562.5	9000	0.9505	0.6958	0.696
0.157	575.0	9200	0.9491	0.7019	0.702
0.1555	587.5	9400	0.9579	0.7018	0.702
0.1554	600.0	9600	0.9698	0.6977	0.698
0.1548	612.5	9800	0.9704	0.6978	0.698
0.1543	625.0	10000	0.9668	0.6968	0.697
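
The table above can be summarised with a few lines of Python. The sketch below copies the Validation Loss and F1 Score columns, finds the evaluation step where each is best, and derives the number of optimizer steps per epoch from the Epoch and Step columns. The variable names are illustrative, and the dataset size bound assumes one batch of 2048 examples per optimizer step (no gradient accumulation, single device), which the card does not state.

```python
# Validation Loss column, one value per evaluation (every 200 steps).
VAL_LOSS = [
    0.5892, 0.5881, 0.5850, 0.6049, 0.6313, 0.6312, 0.6381, 0.6879, 0.6741, 0.6616,
    0.6852, 0.7090, 0.6944, 0.7109, 0.7240, 0.7457, 0.7469, 0.7821, 0.7747, 0.7560,
    0.7846, 0.7904, 0.7928, 0.8287, 0.8437, 0.8609, 0.8534, 0.8720, 0.8517, 0.8594,
    0.8928, 0.8901, 0.8833, 0.8861, 0.9046, 0.9143, 0.9243, 0.9390, 0.9518, 0.9361,
    0.9330, 0.9480, 0.9420, 0.9542, 0.9505, 0.9491, 0.9579, 0.9698, 0.9704, 0.9668,
]
# F1 Score column, in the same order.
F1 = [
    0.6971, 0.7101, 0.7244, 0.7187, 0.7047, 0.7209, 0.7121, 0.7190, 0.7225, 0.7256,
    0.7266, 0.7287, 0.7197, 0.7220, 0.7199, 0.7209, 0.7159, 0.7110, 0.7100, 0.7100,
    0.7109, 0.7110, 0.7116, 0.7093, 0.7106, 0.7096, 0.7067, 0.7120, 0.7091, 0.7020,
    0.7020, 0.7114, 0.7109, 0.6970, 0.7029, 0.6994, 0.6967, 0.7069, 0.7080, 0.7070,
    0.6999, 0.6989, 0.7050, 0.7070, 0.6958, 0.7019, 0.7018, 0.6977, 0.6978, 0.6968,
]

EVAL_EVERY = 200  # the Step column advances by 200 per row
steps = [EVAL_EVERY * (i + 1) for i in range(len(VAL_LOSS))]

# Step with the lowest validation loss and step with the highest F1 score.
best_loss_step = steps[VAL_LOSS.index(min(VAL_LOSS))]
best_f1_step = steps[F1.index(max(F1))]

# The first row pairs step 200 with epoch 12.5.
steps_per_epoch = 200 / 12.5

# Assumption: one batch of 2048 examples per optimizer step.
max_train_examples = int(steps_per_epoch) * 2048

print(f"lowest validation loss: {min(VAL_LOSS):.4f} at step {best_loss_step}")
print(f"highest F1 score:       {max(F1):.4f} at step {best_f1_step}")
print(f"steps per epoch:        {steps_per_epoch}")
print(f"training examples:      at most {max_train_examples}")
```

On these numbers, validation loss is lowest at step 600 and F1 peaks at step 2400, while training loss keeps falling through step 10000. That pattern usually indicates overfitting in the later steps.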

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
