# GUE_tf_4-seqsight_8192_512_30M-L32_all
This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_8192_512_30M on the mahdibaghbanzadeh/GUE_tf_4 dataset. It achieves the following results on the evaluation set:
- Loss: 1.1072
- F1 Score: 0.6985
- Accuracy: 0.7
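Since this checkpoint is a PEFT adapter (see the framework versions below), it would typically be loaded on top of the base model. A minimal, unverified sketch, assuming this repository is published under the id `mahdibaghbanzadeh/GUE_tf_4-seqsight_8192_512_30M-L32_all` and exposes a sequence-classification head:

```python
# Hypothetical usage sketch: load the PEFT adapter on top of the seqsight
# base model for DNA sequence classification. The repo id and the task
# head are assumptions, not confirmed by this card.
from transformers import AutoTokenizer
from peft import AutoPeftModelForSequenceClassification

repo = "mahdibaghbanzadeh/GUE_tf_4-seqsight_8192_512_30M-L32_all"  # assumed id
model = AutoPeftModelForSequenceClassification.from_pretrained(repo)
tokenizer = AutoTokenizer.from_pretrained("mahdibaghbanzadeh/seqsight_8192_512_30M")

inputs = tokenizer("ACGTACGTACGT", return_tensors="pt")
logits = model(**inputs).logits  # one logit per class
```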
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 2048
- eval_batch_size: 2048
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
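With the `linear` scheduler and no warmup listed, the learning rate decays linearly from its initial value to zero over the 10,000 training steps. A pure-Python sketch of that decay rule, mirroring the shape of the Hugging Face `linear` schedule under the assumption of zero warmup steps:

```python
def linear_lr(step, initial_lr=5e-4, total_steps=10_000, warmup_steps=0):
    """Learning rate at `step` under a linear warmup-then-decay schedule.

    Mirrors the shape of the Hugging Face `linear` scheduler: ramp up
    linearly during warmup, then decay linearly to zero at `total_steps`.
    """
    if step < warmup_steps:
        return initial_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return initial_lr * remaining / max(1, total_steps - warmup_steps)

print(linear_lr(0))       # 0.0005 at the start (no warmup)
print(linear_lr(5_000))   # 0.00025 halfway through
print(linear_lr(10_000))  # 0.0 at the final step
```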
### Training results

| Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|
0.5938 | 20.0 | 200 | 0.5839 | 0.6927 | 0.696 |
0.4624 | 40.0 | 400 | 0.5618 | 0.7407 | 0.741 |
0.3879 | 60.0 | 600 | 0.5554 | 0.7666 | 0.767 |
0.3327 | 80.0 | 800 | 0.5816 | 0.7678 | 0.771 |
0.2946 | 100.0 | 1000 | 0.5931 | 0.7744 | 0.776 |
0.2647 | 120.0 | 1200 | 0.5808 | 0.7855 | 0.787 |
0.2412 | 140.0 | 1400 | 0.6176 | 0.7794 | 0.781 |
0.2206 | 160.0 | 1600 | 0.6405 | 0.7669 | 0.77 |
0.2049 | 180.0 | 1800 | 0.6688 | 0.7695 | 0.772 |
0.1907 | 200.0 | 2000 | 0.6833 | 0.7732 | 0.775 |
0.1827 | 220.0 | 2200 | 0.6694 | 0.7772 | 0.779 |
0.1707 | 240.0 | 2400 | 0.7068 | 0.7844 | 0.786 |
0.1623 | 260.0 | 2600 | 0.6585 | 0.7922 | 0.793 |
0.1527 | 280.0 | 2800 | 0.7206 | 0.7775 | 0.78 |
0.1459 | 300.0 | 3000 | 0.7293 | 0.7797 | 0.782 |
0.1402 | 320.0 | 3200 | 0.6942 | 0.7992 | 0.8 |
0.1342 | 340.0 | 3400 | 0.7153 | 0.7863 | 0.788 |
0.1307 | 360.0 | 3600 | 0.7720 | 0.7765 | 0.779 |
0.1232 | 380.0 | 3800 | 0.7279 | 0.7822 | 0.784 |
0.1181 | 400.0 | 4000 | 0.7732 | 0.7808 | 0.783 |
0.1138 | 420.0 | 4200 | 0.7846 | 0.7840 | 0.786 |
0.1092 | 440.0 | 4400 | 0.7541 | 0.7829 | 0.785 |
0.1072 | 460.0 | 4600 | 0.7809 | 0.7938 | 0.796 |
0.102 | 480.0 | 4800 | 0.7725 | 0.7924 | 0.794 |
0.0999 | 500.0 | 5000 | 0.7435 | 0.7949 | 0.796 |
0.0964 | 520.0 | 5200 | 0.7584 | 0.7758 | 0.778 |
0.0933 | 540.0 | 5400 | 0.7664 | 0.7843 | 0.786 |
0.0899 | 560.0 | 5600 | 0.8301 | 0.7762 | 0.779 |
0.0883 | 580.0 | 5800 | 0.7747 | 0.7928 | 0.794 |
0.0857 | 600.0 | 6000 | 0.7789 | 0.7941 | 0.795 |
0.0847 | 620.0 | 6200 | 0.7575 | 0.7899 | 0.791 |
0.0822 | 640.0 | 6400 | 0.7835 | 0.7949 | 0.796 |
0.0781 | 660.0 | 6600 | 0.8146 | 0.7873 | 0.789 |
0.0774 | 680.0 | 6800 | 0.8272 | 0.7817 | 0.784 |
0.0749 | 700.0 | 7000 | 0.8346 | 0.7940 | 0.795 |
0.0741 | 720.0 | 7200 | 0.8273 | 0.7859 | 0.788 |
0.0726 | 740.0 | 7400 | 0.8139 | 0.7902 | 0.792 |
0.0712 | 760.0 | 7600 | 0.8389 | 0.7893 | 0.791 |
0.0689 | 780.0 | 7800 | 0.8566 | 0.7893 | 0.791 |
0.0686 | 800.0 | 8000 | 0.8251 | 0.7977 | 0.799 |
0.067 | 820.0 | 8200 | 0.8071 | 0.7884 | 0.79 |
0.0662 | 840.0 | 8400 | 0.8441 | 0.7874 | 0.789 |
0.0646 | 860.0 | 8600 | 0.8219 | 0.7937 | 0.795 |
0.0633 | 880.0 | 8800 | 0.8501 | 0.7894 | 0.791 |
0.0634 | 900.0 | 9000 | 0.8174 | 0.7862 | 0.788 |
0.0628 | 920.0 | 9200 | 0.8389 | 0.7884 | 0.79 |
0.0619 | 940.0 | 9400 | 0.8552 | 0.7861 | 0.788 |
0.0606 | 960.0 | 9600 | 0.8563 | 0.7891 | 0.791 |
0.0617 | 980.0 | 9800 | 0.8554 | 0.7862 | 0.788 |
0.0607 | 1000.0 | 10000 | 0.8497 | 0.7863 | 0.788 |
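The F1 Score column tracks accuracy closely throughout, which is consistent with a roughly balanced binary classification task. A small self-contained sketch of how the two metrics are computed from predictions (the card does not state which F1 averaging was used; macro averaging is assumed here):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Macro F1: per-class F1 averaged over all classes."""
    scores = []
    for c in set(y_true) | set(y_pred):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0)
    return sum(scores) / len(scores)

# Toy binary example (labels are illustrative, not from this dataset).
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
```

On a balanced task with symmetric errors like this toy example, macro F1 and accuracy coincide, much as the two columns stay within a fraction of a point of each other in the table above.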
### Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2