
2_7e-3_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6157
  • Accuracy: 0.7196
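
The checkpoint can be loaded with the standard transformers auto classes. Below is a minimal sketch, assuming the repo id Onutoa/2_7e-3_1_0.1 shown on this page; the card does not state which SuperGLUE task the model was fine-tuned on, so the input below is purely illustrative:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Repo id as it appears on this page; the target SuperGLUE task
# (and hence the expected input format) is not documented in the card.
model_id = "Onutoa/2_7e-3_1_0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("Example input sentence.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index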

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
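
These settings map directly onto the Hugging Face TrainingArguments API. A minimal reconstruction sketch follows; the original training script is not part of this card, and the output_dir and per-epoch evaluation strategy are assumptions inferred from the results table below:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the settings listed above;
# not the author's actual training script.
training_args = TrainingArguments(
    output_dir="2_7e-3_1_0.1",    # assumption: matches the model name
    learning_rate=7e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",  # assumption: the log below reports per-epoch eval
)
```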

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.0957 | 1.0 | 590 | 1.6286 | 0.6214 |
| 1.1171 | 2.0 | 1180 | 0.6747 | 0.6217 |
| 0.8947 | 3.0 | 1770 | 2.3203 | 0.3786 |
| 0.9598 | 4.0 | 2360 | 0.9842 | 0.6217 |
| 0.9065 | 5.0 | 2950 | 0.7703 | 0.3933 |
| 0.9278 | 6.0 | 3540 | 0.6835 | 0.6217 |
| 0.8667 | 7.0 | 4130 | 1.2649 | 0.3783 |
| 0.9028 | 8.0 | 4720 | 0.8041 | 0.4847 |
| 0.8376 | 9.0 | 5310 | 0.6376 | 0.6382 |
| 0.8633 | 10.0 | 5900 | 0.8873 | 0.6346 |
| 0.8114 | 11.0 | 6490 | 0.6563 | 0.6517 |
| 0.7774 | 12.0 | 7080 | 0.6721 | 0.5927 |
| 0.7993 | 13.0 | 7670 | 0.7169 | 0.5593 |
| 0.783 | 14.0 | 8260 | 0.8230 | 0.6217 |
| 0.7426 | 15.0 | 8850 | 0.8903 | 0.6471 |
| 0.7765 | 16.0 | 9440 | 0.6656 | 0.5972 |
| 0.7135 | 17.0 | 10030 | 0.6012 | 0.6835 |
| 0.7211 | 18.0 | 10620 | 0.7250 | 0.6263 |
| 0.6977 | 19.0 | 11210 | 0.6059 | 0.6942 |
| 0.7171 | 20.0 | 11800 | 0.6088 | 0.6746 |
| 0.6492 | 21.0 | 12390 | 0.6587 | 0.6529 |
| 0.6865 | 22.0 | 12980 | 0.7926 | 0.6306 |
| 0.6446 | 23.0 | 13570 | 0.7486 | 0.6373 |
| 0.6424 | 24.0 | 14160 | 0.5743 | 0.6920 |
| 0.6075 | 25.0 | 14750 | 0.6606 | 0.7116 |
| 0.5918 | 26.0 | 15340 | 0.9846 | 0.5734 |
| 0.6047 | 27.0 | 15930 | 0.7312 | 0.6327 |
| 0.5819 | 28.0 | 16520 | 0.6141 | 0.7131 |
| 0.5636 | 29.0 | 17110 | 0.6814 | 0.7061 |
| 0.5673 | 30.0 | 17700 | 0.6304 | 0.7208 |
| 0.5631 | 31.0 | 18290 | 0.5952 | 0.6994 |
| 0.5297 | 32.0 | 18880 | 0.6358 | 0.7055 |
| 0.5253 | 33.0 | 19470 | 0.6810 | 0.6801 |
| 0.5226 | 34.0 | 20060 | 0.6240 | 0.7196 |
| 0.5117 | 35.0 | 20650 | 0.6342 | 0.6966 |
| 0.5066 | 36.0 | 21240 | 0.5623 | 0.7177 |
| 0.4968 | 37.0 | 21830 | 0.5724 | 0.7153 |
| 0.4829 | 38.0 | 22420 | 0.6402 | 0.7257 |
| 0.4892 | 39.0 | 23010 | 0.6528 | 0.7266 |
| 0.4782 | 40.0 | 23600 | 0.9618 | 0.7003 |
| 0.4845 | 41.0 | 24190 | 0.7193 | 0.7205 |
| 0.4742 | 42.0 | 24780 | 0.6461 | 0.7089 |
| 0.4564 | 43.0 | 25370 | 0.5987 | 0.7260 |
| 0.4592 | 44.0 | 25960 | 0.6792 | 0.7031 |
| 0.4402 | 45.0 | 26550 | 0.6405 | 0.7187 |
| 0.4314 | 46.0 | 27140 | 0.6285 | 0.7193 |
| 0.4351 | 47.0 | 27730 | 0.6312 | 0.7217 |
| 0.4366 | 48.0 | 28320 | 0.6445 | 0.7177 |
| 0.4315 | 49.0 | 28910 | 0.5979 | 0.7281 |
| 0.4207 | 50.0 | 29500 | 0.6114 | 0.7232 |
| 0.4099 | 51.0 | 30090 | 0.6984 | 0.7083 |
| 0.4018 | 52.0 | 30680 | 0.6533 | 0.7125 |
| 0.3998 | 53.0 | 31270 | 0.6237 | 0.7174 |
| 0.3978 | 54.0 | 31860 | 0.6144 | 0.7214 |
| 0.3975 | 55.0 | 32450 | 0.6166 | 0.7245 |
| 0.396 | 56.0 | 33040 | 0.6707 | 0.7138 |
| 0.3958 | 57.0 | 33630 | 0.6091 | 0.7187 |
| 0.3901 | 58.0 | 34220 | 0.6157 | 0.7202 |
| 0.3816 | 59.0 | 34810 | 0.6077 | 0.7239 |
| 0.3754 | 60.0 | 35400 | 0.6157 | 0.7196 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
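
For exact reproduction it can be worth verifying that the local environment matches these versions. A small sanity-check sketch:

```python
# Compare installed library versions against the ones reported above.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    print(f"{name} {have}" + ("" if have == want else f" (card reports {want})"))
```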