GUE_prom_prom_300_notata-seqsight_4096_512_15M-L32_all

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_4096_512_15M on the mahdibaghbanzadeh/GUE_prom_prom_300_notata dataset. It achieves the following results on the evaluation set:

Loss: 0.3061
F1 Score: 0.8809
Accuracy: 0.8809

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 2048
eval_batch_size: 2048
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
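
To make the `lr_scheduler_type: linear` setting concrete, the sketch below computes the learning rate at a given step for a linear decay from 0.0005 to zero over 10000 steps. The card does not list any warmup, so `warmup_steps=0` is an assumption, and the function name `linear_lr` is purely illustrative rather than part of any library.

```python
def linear_lr(step, base_lr=0.0005, total_steps=10_000, warmup_steps=0):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup value.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 2_500, 5_000, 7_500, 10_000):
        print(f"step {step:>6}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the learning rate is half its initial value at step 5000 and reaches zero at the final step.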

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.5323	9.52	200	0.4283	0.8033	0.8046
0.4179	19.05	400	0.3918	0.8219	0.8221
0.3857	28.57	600	0.3561	0.8403	0.8404
0.3369	38.1	800	0.3149	0.8638	0.8640
0.3068	47.62	1000	0.3020	0.8749	0.8749
0.2867	57.14	1200	0.2971	0.8762	0.8762
0.2709	66.67	1400	0.2933	0.8761	0.8764
0.2563	76.19	1600	0.2878	0.8813	0.8813
0.2443	85.71	1800	0.2873	0.8843	0.8843
0.2348	95.24	2000	0.2866	0.8852	0.8852
0.2272	104.76	2200	0.2826	0.8843	0.8845
0.2198	114.29	2400	0.2857	0.8875	0.8875
0.2147	123.81	2600	0.2840	0.8871	0.8871
0.2095	133.33	2800	0.2849	0.8846	0.8847
0.2061	142.86	3000	0.2945	0.8866	0.8866
0.2015	152.38	3200	0.2890	0.8873	0.8873
0.1982	161.9	3400	0.2824	0.8911	0.8911
0.196	171.43	3600	0.2815	0.8903	0.8903
0.1937	180.95	3800	0.2912	0.8868	0.8868
0.1912	190.48	4000	0.2884	0.8858	0.8858
0.188	200.0	4200	0.2868	0.8873	0.8873
0.1871	209.52	4400	0.2966	0.8869	0.8869
0.1836	219.05	4600	0.3002	0.8856	0.8856
0.1803	228.57	4800	0.2935	0.8866	0.8866
0.1802	238.1	5000	0.2988	0.8858	0.8858
0.1781	247.62	5200	0.2998	0.8860	0.8860
0.177	257.14	5400	0.2962	0.8898	0.8898
0.1752	266.67	5600	0.2983	0.8877	0.8877
0.1732	276.19	5800	0.2920	0.8869	0.8869
0.1725	285.71	6000	0.2958	0.8879	0.8879
0.1714	295.24	6200	0.3009	0.8879	0.8879
0.1703	304.76	6400	0.2985	0.8866	0.8866
0.169	314.29	6600	0.2975	0.8883	0.8883
0.1675	323.81	6800	0.2965	0.8881	0.8881
0.1671	333.33	7000	0.3114	0.8856	0.8856
0.1653	342.86	7200	0.3036	0.8866	0.8866
0.1651	352.38	7400	0.2980	0.8883	0.8883
0.1639	361.9	7600	0.3052	0.8869	0.8869
0.1629	371.43	7800	0.2982	0.8896	0.8896
0.1624	380.95	8000	0.3036	0.8873	0.8873
0.1616	390.48	8200	0.3030	0.8866	0.8866
0.1614	400.0	8400	0.3024	0.8873	0.8873
0.1603	409.52	8600	0.3034	0.8869	0.8869
0.1596	419.05	8800	0.2998	0.8869	0.8869
0.159	428.57	9000	0.3049	0.8890	0.8890
0.1593	438.1	9200	0.3088	0.8864	0.8864
0.1579	447.62	9400	0.3060	0.8877	0.8877
0.158	457.14	9600	0.3023	0.8875	0.8875
0.1581	466.67	9800	0.3043	0.8871	0.8871
0.1581	476.19	10000	0.3046	0.8875	0.8875
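
The Epoch and Step columns together imply how many optimizer steps make up one epoch, which in turn bounds the size of the training split. The sketch below works this out from a few rows of the table. It assumes a single device, no gradient accumulation, and that the final partial batch of each epoch is kept; none of these is stated in the card, so the resulting range is an estimate rather than a reported figure.

```python
# (step, epoch) pairs copied from the training results table.
logged = [(200, 9.52), (4200, 200.0), (8400, 400.0), (10000, 476.19)]

# train_batch_size from the hyperparameters section.
batch_size = 2048

# Every logged row should agree on the number of steps per epoch.
candidates = {round(step / epoch) for step, epoch in logged}
assert len(candidates) == 1, "rows disagree on steps per epoch"
steps_per_epoch = candidates.pop()

# With the last partial batch kept, steps_per_epoch = ceil(n / batch_size),
# so n lies in ((steps_per_epoch - 1) * batch_size, steps_per_epoch * batch_size].
min_examples = (steps_per_epoch - 1) * batch_size + 1
max_examples = steps_per_epoch * batch_size

print("steps per epoch:", steps_per_epoch)
print("estimated training examples:", min_examples, "to", max_examples)
```

With these assumptions, the 10000 training steps correspond to roughly 476 passes over the training data at 21 steps per pass.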

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
