# GUE_tf_4-seqsight_65536_512_47M-L32_all
This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_65536_512_47M on the mahdibaghbanzadeh/GUE_tf_4 dataset. It achieves the following results on the evaluation set:
- Loss: 1.0826
- F1 Score: 0.6434
- Accuracy: 0.647
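The card reports both F1 and accuracy but does not state how the F1 score is averaged. As an illustration only, here is a minimal pure-Python sketch of these metrics, assuming binary labels and macro-averaged F1 (the toy labels below are invented, not taken from the evaluation set):

```python
# Illustrative sketch of the reported metrics. Assumes binary labels
# and macro-averaged F1; the card does not specify the averaging mode.

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_for_class(y_true, y_pred, cls):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    if 2 * tp + fp + fn == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)

def macro_f1(y_true, y_pred, classes=(0, 1)):
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)

# Hypothetical toy labels, purely for demonstration.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
print(accuracy(y_true, y_pred))   # 4 of 6 predictions correct
print(macro_f1(y_true, y_pred))
```

Note that F1 and accuracy can diverge when the classes are imbalanced, which is why the card reports both.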
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 2048
- eval_batch_size: 2048
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 10000
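The hyperparameters above imply a linear learning-rate schedule decaying from 5e-4 to 0 over 10,000 steps. The sketch below reproduces that decay in plain Python; it assumes zero warmup steps, which the card does not specify:

```python
# Sketch of the linear LR schedule implied by the hyperparameters above
# (learning_rate=5e-4, lr_scheduler_type=linear, training_steps=10000).
# Assumes zero warmup steps; the card does not state a warmup value.

BASE_LR = 5e-4
TOTAL_STEPS = 10_000

def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Linearly decay the learning rate from base_lr to 0 over total_steps."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))       # full learning rate at the start of training
print(linear_lr(5_000))   # half the base rate at the midpoint
print(linear_lr(10_000))  # decayed to zero at the final step
```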
### Training results
| Training Loss | Epoch | Step | Validation Loss | F1 Score | Accuracy |
|---|---|---|---|---|---|
0.6429 | 20.0 | 200 | 0.6242 | 0.6495 | 0.65 |
0.5598 | 40.0 | 400 | 0.5938 | 0.6990 | 0.699 |
0.5 | 60.0 | 600 | 0.5678 | 0.7125 | 0.715 |
0.4497 | 80.0 | 800 | 0.5621 | 0.7270 | 0.727 |
0.4262 | 100.0 | 1000 | 0.5559 | 0.7432 | 0.744 |
0.4118 | 120.0 | 1200 | 0.5539 | 0.7485 | 0.75 |
0.3982 | 140.0 | 1400 | 0.5559 | 0.7379 | 0.738 |
0.39 | 160.0 | 1600 | 0.5506 | 0.7376 | 0.738 |
0.3813 | 180.0 | 1800 | 0.5543 | 0.7500 | 0.751 |
0.3721 | 200.0 | 2000 | 0.5692 | 0.7418 | 0.742 |
0.3627 | 220.0 | 2200 | 0.5774 | 0.7394 | 0.741 |
0.3552 | 240.0 | 2400 | 0.5622 | 0.7492 | 0.75 |
0.3475 | 260.0 | 2600 | 0.5459 | 0.7529 | 0.753 |
0.3372 | 280.0 | 2800 | 0.5509 | 0.7562 | 0.757 |
0.3274 | 300.0 | 3000 | 0.5506 | 0.7618 | 0.762 |
0.3182 | 320.0 | 3200 | 0.5787 | 0.7554 | 0.758 |
0.3076 | 340.0 | 3400 | 0.5501 | 0.7782 | 0.779 |
0.2999 | 360.0 | 3600 | 0.5493 | 0.7640 | 0.766 |
0.2889 | 380.0 | 3800 | 0.5461 | 0.7793 | 0.78 |
0.2791 | 400.0 | 4000 | 0.5430 | 0.7828 | 0.783 |
0.2711 | 420.0 | 4200 | 0.5613 | 0.7844 | 0.786 |
0.2613 | 440.0 | 4400 | 0.5767 | 0.7811 | 0.783 |
0.2525 | 460.0 | 4600 | 0.5546 | 0.7789 | 0.781 |
0.2441 | 480.0 | 4800 | 0.5489 | 0.7917 | 0.793 |
0.2355 | 500.0 | 5000 | 0.5749 | 0.7831 | 0.785 |
0.2295 | 520.0 | 5200 | 0.5618 | 0.7925 | 0.794 |
0.2219 | 540.0 | 5400 | 0.5502 | 0.8067 | 0.807 |
0.2162 | 560.0 | 5600 | 0.5644 | 0.7957 | 0.797 |
0.2106 | 580.0 | 5800 | 0.5789 | 0.8058 | 0.807 |
0.2077 | 600.0 | 6000 | 0.5623 | 0.8074 | 0.808 |
0.1995 | 620.0 | 6200 | 0.5720 | 0.8083 | 0.809 |
0.1954 | 640.0 | 6400 | 0.5754 | 0.8072 | 0.808 |
0.1907 | 660.0 | 6600 | 0.5907 | 0.8071 | 0.808 |
0.1859 | 680.0 | 6800 | 0.5828 | 0.8091 | 0.81 |
0.183 | 700.0 | 7000 | 0.5844 | 0.8153 | 0.816 |
0.1777 | 720.0 | 7200 | 0.5739 | 0.8196 | 0.82 |
0.1752 | 740.0 | 7400 | 0.6080 | 0.8060 | 0.807 |
0.1738 | 760.0 | 7600 | 0.6083 | 0.8036 | 0.805 |
0.1711 | 780.0 | 7800 | 0.6113 | 0.8121 | 0.813 |
0.1684 | 800.0 | 8000 | 0.6043 | 0.8120 | 0.813 |
0.1669 | 820.0 | 8200 | 0.6051 | 0.8112 | 0.812 |
0.164 | 840.0 | 8400 | 0.6015 | 0.8133 | 0.814 |
0.1612 | 860.0 | 8600 | 0.6188 | 0.8124 | 0.813 |
0.1595 | 880.0 | 8800 | 0.6013 | 0.8123 | 0.813 |
0.1576 | 900.0 | 9000 | 0.5933 | 0.8164 | 0.817 |
0.1579 | 920.0 | 9200 | 0.6078 | 0.8081 | 0.809 |
0.1551 | 940.0 | 9400 | 0.6100 | 0.8132 | 0.814 |
0.1543 | 960.0 | 9600 | 0.6119 | 0.8111 | 0.812 |
0.1545 | 980.0 | 9800 | 0.6110 | 0.8112 | 0.812 |
0.1536 | 1000.0 | 10000 | 0.6102 | 0.8122 | 0.813 |
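Validation F1 peaks mid-training (0.8196 at step 7200) rather than at the final step, so the best checkpoint is worth selecting explicitly. The sketch below scans a sample of rows from the log above for the highest-F1 checkpoint:

```python
# Pick the best checkpoint by validation F1 from a sample of the
# training log above: (step, validation_loss, f1, accuracy).
rows = [
    (2_000, 0.5692, 0.7418, 0.742),
    (4_000, 0.5430, 0.7828, 0.783),
    (6_000, 0.5623, 0.8074, 0.808),
    (7_200, 0.5739, 0.8196, 0.820),
    (10_000, 0.6102, 0.8122, 0.813),
]
best = max(rows, key=lambda r: r[2])  # maximize the F1 column
print(best[0], best[2])  # step 7200 has the highest F1 in this sample
```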
### Framework versions
- PEFT 0.9.0
- Transformers 4.38.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.1
- Tokenizers 0.15.2