GUE_tf_2-seqsight_65536_512_47M-L32_all

This model is a fine-tuned version of mahdibaghbanzadeh/seqsight_65536_512_47M on the mahdibaghbanzadeh/GUE_tf_2 dataset. It achieves the following results on the evaluation set:

Loss: 0.6364
F1 Score: 0.6850
Accuracy: 0.685

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 2048
eval_batch_size: 2048
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 10000
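
To make the schedule concrete, here is a minimal sketch of how a linear schedule would take the learning rate from 0.0005 down to 0 over the 10000 training steps. The function name `linear_lr` and the `warmup_steps` parameter are illustrative, not part of the training code. The card lists no warmup, so a warmup of 0 steps is an assumption.

```python
def linear_lr(step, base_lr=0.0005, total_steps=10_000, warmup_steps=0):
    """Learning rate at `step` under a linear decay schedule.

    base_lr and total_steps come from the hyperparameters listed above.
    warmup_steps=0 is an assumption: the card does not report a warmup.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 2_500, 5_000, 7_500, 10_000):
        print(f"step {step:>6}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate is halved by step 5000 and reaches 0 at the final step.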

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Score	Accuracy
0.6523	20.0	200	0.6602	0.6107	0.615
0.5889	40.0	400	0.6631	0.6142	0.617
0.5457	60.0	600	0.6771	0.6209	0.621
0.5153	80.0	800	0.6813	0.6135	0.614
0.4954	100.0	1000	0.6920	0.6136	0.614
0.4847	120.0	1200	0.6981	0.6200	0.62
0.4753	140.0	1400	0.6823	0.6309	0.631
0.4698	160.0	1600	0.7015	0.6447	0.645
0.4634	180.0	1800	0.6763	0.6356	0.636
0.4559	200.0	2000	0.6808	0.6389	0.639
0.4485	220.0	2200	0.7041	0.6408	0.641
0.4435	240.0	2400	0.6797	0.6558	0.656
0.4352	260.0	2600	0.7195	0.6430	0.643
0.4283	280.0	2800	0.7155	0.65	0.65
0.42	300.0	3000	0.7098	0.6516	0.653
0.4135	320.0	3200	0.7060	0.6532	0.655
0.4048	340.0	3400	0.7106	0.6423	0.644
0.3943	360.0	3600	0.7462	0.6417	0.642
0.3849	380.0	3800	0.7403	0.6528	0.653
0.3768	400.0	4000	0.7351	0.6432	0.645
0.3665	420.0	4200	0.7459	0.6371	0.638
0.3585	440.0	4400	0.7503	0.6372	0.64
0.3501	460.0	4600	0.7474	0.6424	0.643
0.3425	480.0	4800	0.7972	0.6375	0.638
0.3354	500.0	5000	0.7901	0.6448	0.645
0.3266	520.0	5200	0.8136	0.6310	0.631
0.32	540.0	5400	0.7967	0.6369	0.637
0.3145	560.0	5600	0.7992	0.6369	0.637
0.3082	580.0	5800	0.8255	0.6330	0.633
0.3038	600.0	6000	0.8006	0.6268	0.627
0.2966	620.0	6200	0.8352	0.6329	0.633
0.2906	640.0	6400	0.8417	0.6247	0.625
0.2872	660.0	6600	0.8195	0.6369	0.637
0.2801	680.0	6800	0.8518	0.6330	0.633
0.2764	700.0	7000	0.8594	0.638	0.638
0.2728	720.0	7200	0.8553	0.632	0.632
0.2662	740.0	7400	0.8691	0.6319	0.632
0.2665	760.0	7600	0.8889	0.6310	0.631
0.2623	780.0	7800	0.8657	0.63	0.63
0.2598	800.0	8000	0.8847	0.6280	0.628
0.2553	820.0	8200	0.8976	0.6270	0.627
0.2528	840.0	8400	0.8937	0.6320	0.632
0.2509	860.0	8600	0.8924	0.6370	0.637
0.248	880.0	8800	0.9017	0.6249	0.625
0.2473	900.0	9000	0.8995	0.6330	0.633
0.2459	920.0	9200	0.9111	0.6260	0.626
0.2453	940.0	9400	0.9009	0.6209	0.621
0.2441	960.0	9600	0.9082	0.6270	0.627
0.2433	980.0	9800	0.9084	0.6249	0.625
0.243	1000.0	10000	0.9080	0.6250	0.625
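
The table can be summarized with a short script. The sketch below copies a handful of rows from the table (not all of them) and reports where validation loss is lowest, where F1 peaks, and how far training and validation loss have diverged by the final step. The variable names are illustrative. The chosen rows include the extremes of the full table, so the results match what the complete log would give.

```python
# (step, training_loss, validation_loss, f1) for selected rows of the table above.
# This is a subset chosen for illustration, not the complete log.
ROWS = [
    (200, 0.6523, 0.6602, 0.6107),
    (1000, 0.4954, 0.6920, 0.6136),
    (1600, 0.4698, 0.7015, 0.6447),
    (2400, 0.4435, 0.6797, 0.6558),
    (3200, 0.4135, 0.7060, 0.6532),
    (5000, 0.3354, 0.7901, 0.6448),
    (7000, 0.2764, 0.8594, 0.6380),
    (10000, 0.2430, 0.9080, 0.6250),
]

# Row with the lowest validation loss and row with the highest F1 score.
lowest_val = min(ROWS, key=lambda row: row[2])
best_f1 = max(ROWS, key=lambda row: row[3])

# Gap between validation and training loss at the last logged step.
final = ROWS[-1]
final_gap = final[2] - final[1]

# The first table row pairs step 200 with epoch 20.0.
steps_per_epoch = 200 / 20.0

if __name__ == "__main__":
    print(f"lowest validation loss: {lowest_val[2]} at step {lowest_val[0]}")
    print(f"highest F1 score: {best_f1[3]} at step {best_f1[0]}")
    print(f"final validation/training loss gap: {final_gap:.4f}")
    print(f"steps per epoch: {steps_per_epoch:.0f}")
```

Validation loss is lowest at the first logged step and F1 peaks at step 2400, while training loss keeps falling through step 10000. That pattern is consistent with overfitting in the later part of the run, so an earlier checkpoint may be worth considering if one is available. At 10 steps per epoch and a batch size of 2048, the training set would hold at most 20480 examples, assuming a single device and no gradient accumulation (neither is stated in the card).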

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.0+cu121
Datasets 2.17.1
Tokenizers 0.15.2
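
When reproducing the run, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library. The helper name `check_versions` is hypothetical, and the PyPI distribution names (for example `torch` for Pytorch) are the usual ones rather than something the card states.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above, keyed by PyPI distribution name.
EXPECTED = {
    "peft": "0.9.0",
    "transformers": "4.38.2",
    "torch": "2.2.0+cu121",
    "datasets": "2.17.1",
    "tokenizers": "0.15.2",
}


def check_versions(expected):
    """Return {package: (wanted, installed_or_None, matches)} for each entry."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, ok) in check_versions(EXPECTED).items():
        status = "ok" if ok else "differs"
        print(f"{package}: wanted {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily prevent the model from loading. The check only flags where the environment departs from the one reported here.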
