2_1e-2_10_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9669
  • Accuracy: 0.7291
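
For reference, here is a minimal sketch of loading this checkpoint for inference with transformers. The repository id is taken from this card; the paired-sentence input is an assumption, since the exact SuperGLUE task format is not documented here.

```python
# Minimal inference sketch -- assumes the checkpoint is hosted as
# Onutoa/2_1e-2_10_0.5 with a sequence-classification head.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/2_1e-2_10_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Illustrative sentence pair; the real input format depends on which
# SuperGLUE task this model was fine-tuned for.
inputs = tokenizer(
    "The cat sat on the mat.",
    "A cat is on a mat.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label index
```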

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after the list):

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
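
As a rough illustration, the listed values map onto transformers `TrainingArguments` as in the sketch below. This is a reconstruction, not the author's actual training script; `output_dir` and the per-epoch `evaluation_strategy` are assumptions (the latter inferred from the per-epoch results table).

```python
# Hypothetical reconstruction of the training configuration from the
# hyperparameters above; not the original training script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="2_1e-2_10_0.5",      # assumed output directory
    learning_rate=1e-2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",     # assumed from the per-epoch eval table
)
```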

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 2.7272        | 1.0   | 590   | 2.1134          | 0.4018   |
| 2.2666        | 2.0   | 1180  | 3.2261          | 0.3783   |
| 2.3033        | 3.0   | 1770  | 2.2480          | 0.3783   |
| 2.1786        | 4.0   | 2360  | 2.7497          | 0.6208   |
| 2.1516        | 5.0   | 2950  | 1.7255          | 0.6492   |
| 1.9363        | 6.0   | 3540  | 3.4672          | 0.3783   |
| 2.0556        | 7.0   | 4130  | 2.9543          | 0.4664   |
| 2.0717        | 8.0   | 4720  | 1.9668          | 0.6297   |
| 2.238         | 9.0   | 5310  | 2.0150          | 0.6376   |
| 2.0674        | 10.0  | 5900  | 1.9047          | 0.6419   |
| 1.9777        | 11.0  | 6490  | 1.8100          | 0.6104   |
| 1.8447        | 12.0  | 7080  | 1.7533          | 0.6367   |
| 1.9655        | 13.0  | 7670  | 1.5246          | 0.6612   |
| 1.7583        | 14.0  | 8260  | 1.4859          | 0.6508   |
| 1.6346        | 15.0  | 8850  | 2.1240          | 0.6869   |
| 1.6424        | 16.0  | 9440  | 1.4976          | 0.6474   |
| 1.5083        | 17.0  | 10030 | 1.2798          | 0.6939   |
| 1.6096        | 18.0  | 10620 | 1.8015          | 0.6278   |
| 1.6952        | 19.0  | 11210 | 1.6068          | 0.6774   |
| 1.6535        | 20.0  | 11800 | 1.7095          | 0.6076   |
| 1.544         | 21.0  | 12390 | 1.4624          | 0.6832   |
| 1.5493        | 22.0  | 12980 | 1.3701          | 0.7015   |
| 1.4743        | 23.0  | 13570 | 1.3619          | 0.7040   |
| 1.4021        | 24.0  | 14160 | 1.2429          | 0.6832   |
| 1.3916        | 25.0  | 14750 | 1.4104          | 0.6853   |
| 1.3976        | 26.0  | 15340 | 1.3662          | 0.6621   |
| 1.4054        | 27.0  | 15930 | 1.3757          | 0.6382   |
| 1.282         | 28.0  | 16520 | 1.3488          | 0.6639   |
| 1.2595        | 29.0  | 17110 | 1.1823          | 0.6988   |
| 1.2441        | 30.0  | 17700 | 1.3444          | 0.7180   |
| 1.1883        | 31.0  | 18290 | 1.1253          | 0.7083   |
| 1.188         | 32.0  | 18880 | 1.1578          | 0.7229   |
| 1.1719        | 33.0  | 19470 | 1.2075          | 0.6884   |
| 1.1201        | 34.0  | 20060 | 1.0837          | 0.7156   |
| 1.1222        | 35.0  | 20650 | 1.1085          | 0.7015   |
| 1.0624        | 36.0  | 21240 | 1.3319          | 0.7196   |
| 1.0747        | 37.0  | 21830 | 1.3808          | 0.6560   |
| 1.028         | 38.0  | 22420 | 1.1399          | 0.7242   |
| 1.0343        | 39.0  | 23010 | 1.0303          | 0.7101   |
| 0.9876        | 40.0  | 23600 | 1.1261          | 0.7275   |
| 0.9899        | 41.0  | 24190 | 1.4611          | 0.7235   |
| 0.9883        | 42.0  | 24780 | 1.1315          | 0.7333   |
| 0.9558        | 43.0  | 25370 | 1.0614          | 0.7040   |
| 0.9663        | 44.0  | 25960 | 1.0889          | 0.7131   |
| 0.9311        | 45.0  | 26550 | 0.9791          | 0.7235   |
| 0.9269        | 46.0  | 27140 | 0.9895          | 0.7254   |
| 0.8845        | 47.0  | 27730 | 0.9648          | 0.7336   |
| 0.9076        | 48.0  | 28320 | 0.9665          | 0.7343   |
| 0.8691        | 49.0  | 28910 | 0.9858          | 0.7339   |
| 0.8558        | 50.0  | 29500 | 0.9660          | 0.7239   |
| 0.8443        | 51.0  | 30090 | 0.9774          | 0.7294   |
| 0.8341        | 52.0  | 30680 | 1.0947          | 0.7024   |
| 0.8268        | 53.0  | 31270 | 1.0108          | 0.7315   |
| 0.8243        | 54.0  | 31860 | 0.9856          | 0.7260   |
| 0.8072        | 55.0  | 32450 | 1.0354          | 0.7199   |
| 0.807         | 56.0  | 33040 | 0.9688          | 0.7269   |
| 0.8015        | 57.0  | 33630 | 0.9622          | 0.7291   |
| 0.771         | 58.0  | 34220 | 0.9676          | 0.7269   |
| 0.7829        | 59.0  | 34810 | 0.9740          | 0.7321   |
| 0.7862        | 60.0  | 35400 | 0.9669          | 0.7291   |

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3