GUE_tf_2-seqsight_65536_512_94M-L32_f

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_65536_512_94M on the mahdibaghbanzadeh/GUE_tf_2 dataset. It achieves the following results on the evaluation set:

Loss: 0.4680
F1 Score: 0.7900
Accuracy: 0.79

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
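
To make the schedule concrete, here is a minimal sketch of how a linear scheduler maps a training step to a learning rate with the values listed above. It assumes zero warmup steps, since the card does not list a warmup setting; `linear_lr` and its `warmup_steps` parameter are hypothetical names used only for illustration, not part of the original training script.

```python
# Values taken from the hyperparameter list above.
LEARNING_RATE = 0.0005
TRAINING_STEPS = 10000


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TRAINING_STEPS, warmup_steps=0):
    """Learning rate at a given step: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup value.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 2500, 5000, 10000):
        print(step, linear_lr(step))
```

Under this assumption the learning rate starts at 0.0005, is halved at step 5000, and reaches zero at step 10000.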

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.5454	1.34	200	0.5144	0.7380	0.741
0.5065	2.68	400	0.5020	0.7565	0.757
0.4949	4.03	600	0.4917	0.7578	0.758
0.4828	5.37	800	0.4832	0.7607	0.761
0.4735	6.71	1000	0.4968	0.7522	0.753
0.4617	8.05	1200	0.4836	0.7536	0.754
0.4542	9.4	1400	0.4906	0.7650	0.765
0.4451	10.74	1600	0.4919	0.7610	0.761
0.4323	12.08	1800	0.4952	0.7489	0.749
0.4226	13.42	2000	0.5073	0.7549	0.755
0.4153	14.77	2200	0.4973	0.7559	0.756
0.403	16.11	2400	0.5103	0.7520	0.752
0.3962	17.45	2600	0.5157	0.7539	0.754
0.3769	18.79	2800	0.5220	0.7448	0.745
0.3743	20.13	3000	0.5062	0.766	0.766
0.3642	21.48	3200	0.5939	0.7403	0.741
0.3538	22.82	3400	0.5488	0.7620	0.762
0.3433	24.16	3600	0.5599	0.7480	0.748
0.3368	25.5	3800	0.5611	0.7509	0.751
0.3299	26.85	4000	0.5910	0.7467	0.747
0.3192	28.19	4200	0.6363	0.7303	0.732
0.3104	29.53	4400	0.6327	0.7425	0.743
0.3026	30.87	4600	0.6015	0.7408	0.741
0.2956	32.21	4800	0.6333	0.7367	0.737
0.287	33.56	5000	0.6330	0.7427	0.743
0.2834	34.9	5200	0.6429	0.7466	0.747
0.2729	36.24	5400	0.6588	0.7378	0.738
0.2728	37.58	5600	0.6616	0.7349	0.735
0.264	38.93	5800	0.6898	0.7360	0.737
0.2548	40.27	6000	0.6694	0.7400	0.74
0.2557	41.61	6200	0.6610	0.7490	0.749
0.2497	42.95	6400	0.6903	0.7379	0.738
0.2403	44.3	6600	0.7028	0.7370	0.737
0.2425	45.64	6800	0.7037	0.7369	0.737
0.2361	46.98	7000	0.7137	0.7335	0.734
0.227	48.32	7200	0.7559	0.7354	0.736
0.2231	49.66	7400	0.7477	0.7376	0.738
0.222	51.01	7600	0.7459	0.7306	0.731
0.217	52.35	7800	0.7566	0.7427	0.743
0.2249	53.69	8000	0.7300	0.7337	0.734
0.2099	55.03	8200	0.7541	0.7347	0.735
0.2108	56.38	8400	0.7720	0.7358	0.736
0.205	57.72	8600	0.7856	0.7379	0.738
0.209	59.06	8800	0.7668	0.7377	0.738
0.2001	60.4	9000	0.7753	0.7389	0.739
0.2036	61.74	9200	0.7793	0.7367	0.737
0.2014	63.09	9400	0.7907	0.7337	0.734
0.2014	64.43	9600	0.7843	0.7369	0.737
0.1962	65.77	9800	0.7948	0.7318	0.732
0.199	67.11	10000	0.7940	0.7358	0.736
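
The table can be scanned for the strongest checkpoint with a few lines of code. The sketch below copies a handful of rows from the table (step, validation loss, F1 score) and picks the best one by each metric; `ROWS` and `best_by` are illustrative names, and the subset of rows was chosen by hand rather than produced by the training run.

```python
# (step, validation_loss, f1_score) copied from selected rows of the table above.
ROWS = [
    (200, 0.5144, 0.7380),
    (800, 0.4832, 0.7607),
    (1400, 0.4906, 0.7650),
    (3000, 0.5062, 0.766),
    (5000, 0.6330, 0.7427),
    (10000, 0.7940, 0.7358),
]


def best_by(rows, key, maximize):
    """Return the row with the largest (or smallest) value of `key`."""
    pick = max if maximize else min
    return pick(rows, key=key)


best_loss = best_by(ROWS, key=lambda r: r[1], maximize=False)
best_f1 = best_by(ROWS, key=lambda r: r[2], maximize=True)

if __name__ == "__main__":
    print("lowest validation loss:", best_loss)
    print("highest F1 score:", best_f1)
```

On the full table the same two rows stand out: validation loss is lowest at step 800 and F1 peaks at step 3000, while training loss keeps falling through step 10000. That pattern suggests the later checkpoints overfit the training data.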

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
