GUE_tf_1-seqsight_8192_512_30M-L32_all

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_8192_512_30M on the mahdibaghbanzadeh/GUE_tf_1 dataset. It achieves the following results on the evaluation set:

Loss: 0.5174
F1 Score: 0.7418
Accuracy: 0.746

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 2048
eval_batch_size: 2048
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
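
As an illustration of what `lr_scheduler_type: linear` implies for the values above, here is a minimal sketch of the learning-rate curve. It assumes zero warmup steps (the card does not list a warmup setting) and decay to zero at the final step; `linear_lr` is a hypothetical helper, not part of the training code.

```python
def linear_lr(step: int,
              base_lr: float = 5e-4,      # learning_rate from the card
              total_steps: int = 10_000,  # training_steps from the card
              warmup_steps: int = 0) -> float:  # assumption: no warmup
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at a few points along the run.
    for step in (0, 2_000, 5_000, 10_000):
        print(f"step {step:>6}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.0005, is halved by step 5000, and reaches zero at step 10000.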

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.6225	13.33	200	0.5973	0.6770	0.677
0.5426	26.67	400	0.6050	0.6616	0.663
0.5017	40.0	600	0.6217	0.6740	0.674
0.4668	53.33	800	0.6290	0.6902	0.692
0.4393	66.67	1000	0.6491	0.6888	0.689
0.4151	80.0	1200	0.6627	0.6889	0.689
0.3961	93.33	1400	0.6513	0.6840	0.684
0.3797	106.67	1600	0.6851	0.6879	0.688
0.3656	120.0	1800	0.7099	0.6855	0.686
0.3537	133.33	2000	0.7395	0.6800	0.68
0.3408	146.67	2200	0.7374	0.6830	0.683
0.3307	160.0	2400	0.7293	0.6840	0.684
0.3191	173.33	2600	0.7739	0.6810	0.681
0.3083	186.67	2800	0.7673	0.6770	0.677
0.2991	200.0	3000	0.8049	0.6789	0.679
0.289	213.33	3200	0.7730	0.6768	0.677
0.2784	226.67	3400	0.8322	0.6779	0.678
0.2716	240.0	3600	0.8422	0.6690	0.67
0.262	253.33	3800	0.8461	0.6730	0.673
0.2521	266.67	4000	0.8696	0.6776	0.678
0.2461	280.0	4200	0.8740	0.6739	0.674
0.2383	293.33	4400	0.9173	0.6850	0.685
0.2307	306.67	4600	0.9165	0.6779	0.678
0.2255	320.0	4800	0.9309	0.6857	0.686
0.2192	333.33	5000	0.9353	0.6709	0.671
0.2138	346.67	5200	0.9088	0.6780	0.678
0.2083	360.0	5400	0.9699	0.6704	0.671
0.2018	373.33	5600	0.9811	0.6769	0.677
0.1975	386.67	5800	0.9467	0.6687	0.669
0.1925	400.0	6000	0.9813	0.6755	0.676
0.1886	413.33	6200	0.9830	0.6779	0.678
0.184	426.67	6400	0.9905	0.6770	0.677
0.1806	440.0	6600	1.0004	0.6721	0.673
0.1771	453.33	6800	1.0257	0.6809	0.681
0.1726	466.67	7000	1.0673	0.6677	0.668
0.1702	480.0	7200	1.0637	0.6689	0.669
0.1674	493.33	7400	1.0590	0.6670	0.667
0.1655	506.67	7600	1.0730	0.6680	0.668
0.1629	520.0	7800	1.0953	0.6730	0.673
0.1594	533.33	8000	1.0809	0.6679	0.668
0.1588	546.67	8200	1.0749	0.6650	0.665
0.1565	560.0	8400	1.0858	0.6709	0.671
0.1543	573.33	8600	1.1003	0.6650	0.665
0.1528	586.67	8800	1.0985	0.6680	0.668
0.1504	600.0	9000	1.1135	0.6670	0.667
0.1502	613.33	9200	1.1064	0.6669	0.667
0.1491	626.67	9400	1.1020	0.6678	0.668
0.1492	640.0	9600	1.1107	0.6670	0.667
0.1482	653.33	9800	1.1083	0.6690	0.669
0.1475	666.67	10000	1.1123	0.6630	0.663
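
The table can be read programmatically to relate steps to epochs and to locate the strongest checkpoints. The sketch below copies a handful of rows from the table; the helper names are hypothetical. The training-set size bound assumes a single device, no gradient accumulation, and a partial final batch that is kept, none of which the card confirms.

```python
# (step, epoch, validation_loss, f1) copied from selected rows of the table
rows = [
    (200, 13.33, 0.5973, 0.6770),
    (600, 40.0, 0.6217, 0.6740),
    (800, 53.33, 0.6290, 0.6902),
    (2000, 133.33, 0.7395, 0.6800),
    (5000, 333.33, 0.9353, 0.6709),
    (10000, 666.67, 1.1123, 0.6630),
]
train_batch_size = 2048  # from the hyperparameter list

# Steps per epoch, taken from the row with an exact epoch count (600 / 40.0).
steps_per_epoch = round(600 / 40.0)

# Rough bounds on the number of training examples (see assumptions above).
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

# Checkpoints with the lowest validation loss and the highest F1 score.
best_by_loss = min(rows, key=lambda r: r[2])
best_by_f1 = max(rows, key=lambda r: r[3])

if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("training examples between", min_examples, "and", max_examples)
    print("lowest validation loss at step", best_by_loss[0])
    print("highest F1 at step", best_by_f1[0])
```

In the full table, validation loss is lowest at step 200 (0.5973) and F1 peaks at step 800 (0.6902). After that, training loss keeps falling while validation loss rises, which is the usual signature of overfitting.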

Framework versions

PEFT 0.9.0
Transformers 4.38.2
PyTorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
