GUE_tf_1-seqsight_65536_512_47M-L32_all

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_65536_512_47M on the mahdibaghbanzadeh/GUE_tf_1 dataset. It achieves the following results on the evaluation set:

Loss: 0.5477
F1 Score: 0.7309
Accuracy: 0.731

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 2048
eval_batch_size: 2048
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
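
As a rough illustration of what these settings imply, the sketch below computes a learning rate that decays linearly from 0.0005 at step 0 to zero at step 10000. It assumes zero warmup steps, since no warmup value is listed above. The function name `linear_lr` is hypothetical and not part of any library.

```python
# Sketch of a linear learning-rate decay.
# Assumption: no warmup phase (no warmup setting is listed in the card).
PEAK_LR = 0.0005      # learning_rate from the list above
TOTAL_STEPS = 10000   # training_steps from the list above


def linear_lr(step: int, peak_lr: float = PEAK_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Return the learning rate at `step` under a linear decay to zero."""
    if not 0 <= step <= total_steps:
        raise ValueError("step must lie in [0, total_steps]")
    # Fraction of training that is still ahead of us.
    remaining = (total_steps - step) / total_steps
    return peak_lr * remaining


if __name__ == "__main__":
    # Print the rate at a few points along the schedule.
    for step in (0, 2500, 5000, 10000):
        print(step, linear_lr(step))
```

If the actual run used warmup, the early part of the curve would differ from this sketch.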

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.6592	13.33	200	0.6495	0.6033	0.607
0.6107	26.67	400	0.6363	0.6399	0.64
0.5799	40.0	600	0.6327	0.6369	0.637
0.5506	53.33	800	0.6434	0.6484	0.654
0.5313	66.67	1000	0.6420	0.6538	0.654
0.5205	80.0	1200	0.6391	0.6540	0.654
0.5121	93.33	1400	0.6552	0.6500	0.65
0.506	106.67	1600	0.6545	0.6498	0.65
0.5004	120.0	1800	0.6450	0.6490	0.649
0.4956	133.33	2000	0.6594	0.6573	0.658
0.4913	146.67	2200	0.6655	0.6543	0.655
0.4853	160.0	2400	0.6853	0.6550	0.655
0.4795	173.33	2600	0.6759	0.6636	0.664
0.4731	186.67	2800	0.6927	0.6556	0.656
0.4688	200.0	3000	0.7036	0.6690	0.669
0.4642	213.33	3200	0.7004	0.6579	0.658
0.4583	226.67	3400	0.6976	0.6557	0.656
0.4529	240.0	3600	0.7143	0.6559	0.656
0.449	253.33	3800	0.7127	0.6477	0.648
0.4429	266.67	4000	0.7309	0.6578	0.658
0.4371	280.0	4200	0.7469	0.6514	0.652
0.4317	293.33	4400	0.7238	0.6510	0.651
0.4266	306.67	4600	0.7404	0.6530	0.653
0.4216	320.0	4800	0.7518	0.6498	0.65
0.4165	333.33	5000	0.7623	0.6488	0.649
0.4119	346.67	5200	0.7583	0.6430	0.644
0.4069	360.0	5400	0.7826	0.6324	0.634
0.4046	373.33	5600	0.7873	0.6470	0.647
0.3982	386.67	5800	0.7936	0.6450	0.645
0.3961	400.0	6000	0.7770	0.6400	0.64
0.3908	413.33	6200	0.7884	0.6448	0.645
0.3876	426.67	6400	0.7895	0.6470	0.647
0.3831	440.0	6600	0.7965	0.6450	0.645
0.3799	453.33	6800	0.8196	0.6509	0.651
0.3769	466.67	7000	0.7986	0.6350	0.635
0.3748	480.0	7200	0.8324	0.64	0.64
0.3713	493.33	7400	0.8162	0.6410	0.641
0.3681	506.67	7600	0.8072	0.6409	0.641
0.3674	520.0	7800	0.8191	0.6458	0.646
0.3641	533.33	8000	0.8127	0.6460	0.646
0.3622	546.67	8200	0.8402	0.6440	0.644
0.3613	560.0	8400	0.8076	0.6400	0.64
0.3584	573.33	8600	0.8270	0.6490	0.649
0.3567	586.67	8800	0.8132	0.6530	0.653
0.3568	600.0	9000	0.8259	0.644	0.644
0.3553	613.33	9200	0.8248	0.6498	0.65
0.3548	626.67	9400	0.8155	0.6450	0.645
0.3529	640.0	9600	0.8233	0.6500	0.65
0.352	653.33	9800	0.8226	0.6490	0.649
0.3511	666.67	10000	0.8218	0.6460	0.646
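
In the table above, training loss falls from 0.6592 to 0.3511 while validation loss bottoms out early and then climbs to 0.8218, which is the usual signature of overfitting. The sketch below shows one way to pick a checkpoint from such a log. It uses a hand-copied subset of the rows, chosen to include the rows with the lowest validation loss and the highest F1 score. The names `ROWS`, `best_by_val_loss`, `best_by_f1`, and `steps_per_epoch` are hypothetical.

```python
# Hand-copied subset of the training log above:
# (step, epoch, training_loss, validation_loss, f1, accuracy)
ROWS = [
    (200, 13.33, 0.6592, 0.6495, 0.6033, 0.607),
    (600, 40.0, 0.5799, 0.6327, 0.6369, 0.637),
    (1200, 80.0, 0.5205, 0.6391, 0.6540, 0.654),
    (3000, 200.0, 0.4688, 0.7036, 0.6690, 0.669),
    (6000, 400.0, 0.3961, 0.7770, 0.6400, 0.64),
    (10000, 666.67, 0.3511, 0.8218, 0.6460, 0.646),
]


def best_by_val_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[3])


def best_by_f1(rows):
    """Return the row with the highest F1 score."""
    return max(rows, key=lambda r: r[4])


def steps_per_epoch(row):
    """Estimate optimizer steps per epoch from one (step, epoch, ...) row."""
    return round(row[0] / row[1])


if __name__ == "__main__":
    print("lowest validation loss at step", best_by_val_loss(ROWS)[0])
    print("highest F1 at step", best_by_f1(ROWS)[0])
    print("steps per epoch:", steps_per_epoch(ROWS[1]))
```

The two criteria disagree here (step 600 versus step 3000), so the choice of checkpoint depends on which metric matters more for the intended use. The step and epoch columns imply about 15 steps per epoch. With a batch size of 2048, and assuming no gradient accumulation (none is listed), that means a training set of at most 15 × 2048 = 30720 examples.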

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
