2_4e-3_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the results):

  • Loss: 0.5293
  • Accuracy: 0.7272
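
Below is a minimal inference sketch. The repository id Onutoa/2_4e-3_1_0.1 is taken from this card; the text-classification head and the example sentence pair are assumptions, since the card does not state which SuperGLUE task the checkpoint was trained on.

```python
# Minimal inference sketch (Transformers 4.30). The sequence-classification
# head and the sentence-pair input format are assumptions; the card does
# not say which SuperGLUE task this checkpoint targets.
from transformers import pipeline

classifier = pipeline("text-classification", model="Onutoa/2_4e-3_1_0.1")

# Most SuperGLUE tasks are sentence-pair problems, so a paired input is
# passed here; adjust to the actual task's input format.
print(classifier({"text": "The sky is blue.", "text_pair": "Is the sky blue?"}))
```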

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list for how they map onto TrainingArguments):

  • learning_rate: 0.004
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
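
For reference, here is a hedged sketch of how the values above map onto Transformers TrainingArguments; the output directory is a placeholder, and mapping train_batch_size to per_device_train_batch_size assumes single-device training.

```python
# Sketch only: the hyperparameters above expressed as TrainingArguments
# (Transformers 4.30). output_dir is a placeholder, not from this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="2_4e-3_1_0.1",       # placeholder
    learning_rate=4e-3,
    per_device_train_batch_size=16,  # assumes single-device training
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the
    # Transformers optimizer defaults, so no extra arguments are needed.
)
```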

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 0.8756        | 1.0   | 590   | 0.9984          | 0.6211   |
| 0.8309        | 2.0   | 1180  | 0.7494          | 0.6217   |
| 0.8162        | 3.0   | 1770  | 0.8910          | 0.3826   |
| 0.8025        | 4.0   | 2360  | 0.6504          | 0.6028   |
| 0.8059        | 5.0   | 2950  | 0.6535          | 0.5945   |
| 0.768         | 6.0   | 3540  | 0.6293          | 0.6291   |
| 0.7423        | 7.0   | 4130  | 0.9356          | 0.4339   |
| 0.7272        | 8.0   | 4720  | 0.7985          | 0.6220   |
| 0.7076        | 9.0   | 5310  | 0.6240          | 0.6541   |
| 0.6803        | 10.0  | 5900  | 0.6284          | 0.6639   |
| 0.6637        | 11.0  | 6490  | 0.6013          | 0.6691   |
| 0.6217        | 12.0  | 7080  | 0.5783          | 0.6725   |
| 0.6169        | 13.0  | 7670  | 0.5657          | 0.6841   |
| 0.5962        | 14.0  | 8260  | 0.6273          | 0.6618   |
| 0.5937        | 15.0  | 8850  | 0.5982          | 0.6725   |
| 0.5811        | 16.0  | 9440  | 0.6778          | 0.5997   |
| 0.5534        | 17.0  | 10030 | 0.5478          | 0.7028   |
| 0.5641        | 18.0  | 10620 | 0.5615          | 0.7034   |
| 0.5588        | 19.0  | 11210 | 0.5467          | 0.7076   |
| 0.5611        | 20.0  | 11800 | 0.5505          | 0.7058   |
| 0.5423        | 21.0  | 12390 | 0.5617          | 0.7086   |
| 0.5372        | 22.0  | 12980 | 0.5483          | 0.7003   |
| 0.5387        | 23.0  | 13570 | 0.5560          | 0.7113   |
| 0.5274        | 24.0  | 14160 | 0.5278          | 0.7131   |
| 0.5242        | 25.0  | 14750 | 0.5377          | 0.7150   |
| 0.5256        | 26.0  | 15340 | 0.5796          | 0.6856   |
| 0.5203        | 27.0  | 15930 | 0.5456          | 0.6976   |
| 0.5087        | 28.0  | 16520 | 0.5365          | 0.7199   |
| 0.5127        | 29.0  | 17110 | 0.5419          | 0.7049   |
| 0.5005        | 30.0  | 17700 | 0.5417          | 0.7257   |
| 0.5008        | 31.0  | 18290 | 0.5257          | 0.7116   |
| 0.4959        | 32.0  | 18880 | 0.5463          | 0.7232   |
| 0.4931        | 33.0  | 19470 | 0.5251          | 0.7260   |
| 0.4849        | 34.0  | 20060 | 0.5282          | 0.7217   |
| 0.4733        | 35.0  | 20650 | 0.5296          | 0.7199   |
| 0.4842        | 36.0  | 21240 | 0.5230          | 0.7229   |
| 0.4811        | 37.0  | 21830 | 0.5264          | 0.7232   |
| 0.4683        | 38.0  | 22420 | 0.5518          | 0.7058   |
| 0.4692        | 39.0  | 23010 | 0.5256          | 0.7300   |
| 0.4621        | 40.0  | 23600 | 0.5292          | 0.7303   |
| 0.4624        | 41.0  | 24190 | 0.5467          | 0.7110   |
| 0.4618        | 42.0  | 24780 | 0.5189          | 0.7324   |
| 0.465         | 43.0  | 25370 | 0.5285          | 0.7330   |
| 0.453         | 44.0  | 25960 | 0.5577          | 0.7113   |
| 0.4533        | 45.0  | 26550 | 0.5170          | 0.7343   |
| 0.4524        | 46.0  | 27140 | 0.5219          | 0.7223   |
| 0.4454        | 47.0  | 27730 | 0.5367          | 0.7257   |
| 0.4401        | 48.0  | 28320 | 0.5251          | 0.7339   |
| 0.4547        | 49.0  | 28910 | 0.5300          | 0.7254   |
| 0.4374        | 50.0  | 29500 | 0.5318          | 0.7278   |
| 0.444         | 51.0  | 30090 | 0.5317          | 0.7239   |
| 0.4363        | 52.0  | 30680 | 0.5309          | 0.7306   |
| 0.4381        | 53.0  | 31270 | 0.5206          | 0.7312   |
| 0.4314        | 54.0  | 31860 | 0.5283          | 0.7269   |
| 0.4334        | 55.0  | 32450 | 0.5254          | 0.7278   |
| 0.43          | 56.0  | 33040 | 0.5317          | 0.7278   |
| 0.4194        | 57.0  | 33630 | 0.5261          | 0.7272   |
| 0.4341        | 58.0  | 34220 | 0.5266          | 0.7300   |
| 0.4243        | 59.0  | 34810 | 0.5269          | 0.7275   |
| 0.4191        | 60.0  | 35400 | 0.5293          | 0.7272   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
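
As a quick sanity check, the pins above can be verified in Python; this is a sketch, and the torch version string includes the CUDA build suffix listed above.

```python
# Verify the local environment matches the framework versions listed above.
import datasets
import tokenizers
import torch
import transformers

assert transformers.__version__ == "4.30.0"
assert torch.__version__ == "2.0.1+cu117"
assert datasets.__version__ == "2.14.4"
assert tokenizers.__version__ == "0.13.3"
```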