2_5e-3_10_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8743
  • Accuracy: 0.7407
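
A minimal usage sketch follows. The card does not state the task head or input format, so the snippet assumes a standard sequence-classification checkpoint and a sentence-pair input; the actual SuperGLUE subset may differ.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Repo id as listed on this card; the sentence-pair input format is an assumption.
model_id = "Onutoa/2_5e-3_10_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("example passage", "example question", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index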

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
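
The card names super_glue as the training dataset but not which subset. A hedged loading sketch with 🤗 Datasets is below; "boolq" is a placeholder config, not confirmed by this card:

```python
from datasets import load_dataset

# "boolq" is an assumed placeholder config; this card does not state
# which SuperGLUE task the model was fine-tuned on.
dataset = load_dataset("super_glue", "boolq")
print(dataset["train"][0])
print(dataset["validation"].num_rows)
```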

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
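
As a hedged reconstruction, these settings map onto the 🤗 Trainer API roughly as follows; anything not listed above (weight decay, warmup, etc.) is assumed to remain at its default:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above; output_dir and the
# per-epoch evaluation/logging cadence are assumptions.
training_args = TrainingArguments(
    output_dir="2_5e-3_10_0.5",
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",
    logging_strategy="epoch",
)
```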

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 2.0951        | 1.0   | 590   | 2.8478          | 0.6208   |
| 2.0966        | 2.0   | 1180  | 2.0402          | 0.6208   |
| 1.9864        | 3.0   | 1770  | 2.9563          | 0.4196   |
| 1.9962        | 4.0   | 2360  | 2.4148          | 0.4905   |
| 1.8743        | 5.0   | 2950  | 2.1057          | 0.6217   |
| 1.562         | 6.0   | 3540  | 1.6253          | 0.6636   |
| 1.4913        | 7.0   | 4130  | 1.4832          | 0.6734   |
| 1.4114        | 8.0   | 4720  | 1.4386          | 0.6560   |
| 1.3732        | 9.0   | 5310  | 1.4139          | 0.6508   |
| 1.3161        | 10.0  | 5900  | 1.3009          | 0.6893   |
| 1.2979        | 11.0  | 6490  | 1.2760          | 0.6963   |
| 1.1837        | 12.0  | 7080  | 1.2606          | 0.6737   |
| 1.2171        | 13.0  | 7670  | 1.2241          | 0.7040   |
| 1.1545        | 14.0  | 8260  | 1.2533          | 0.7086   |
| 1.1424        | 15.0  | 8850  | 1.1613          | 0.7061   |
| 1.1106        | 16.0  | 9440  | 1.1290          | 0.7018   |
| 1.0798        | 17.0  | 10030 | 1.1366          | 0.7049   |
| 1.0665        | 18.0  | 10620 | 1.1030          | 0.7147   |
| 1.0642        | 19.0  | 11210 | 1.1100          | 0.7168   |
| 1.0498        | 20.0  | 11800 | 1.1124          | 0.7235   |
| 0.9966        | 21.0  | 12390 | 1.1192          | 0.7211   |
| 1.0178        | 22.0  | 12980 | 1.0786          | 0.7211   |
| 0.9956        | 23.0  | 13570 | 1.0710          | 0.7024   |
| 0.9896        | 24.0  | 14160 | 1.0254          | 0.7211   |
| 0.9496        | 25.0  | 14750 | 1.0181          | 0.7217   |
| 0.9755        | 26.0  | 15340 | 1.0013          | 0.7211   |
| 0.9439        | 27.0  | 15930 | 1.0014          | 0.7153   |
| 0.9151        | 28.0  | 16520 | 0.9923          | 0.7336   |
| 0.8988        | 29.0  | 17110 | 0.9776          | 0.7318   |
| 0.8962        | 30.0  | 17700 | 0.9625          | 0.7401   |
| 0.8825        | 31.0  | 18290 | 0.9702          | 0.7346   |
| 0.8734        | 32.0  | 18880 | 0.9766          | 0.7394   |
| 0.8651        | 33.0  | 19470 | 0.9443          | 0.7394   |
| 0.8404        | 34.0  | 20060 | 0.9665          | 0.7364   |
| 0.8312        | 35.0  | 20650 | 0.9290          | 0.7370   |
| 0.8401        | 36.0  | 21240 | 0.9546          | 0.7309   |
| 0.8121        | 37.0  | 21830 | 0.9287          | 0.7391   |
| 0.8162        | 38.0  | 22420 | 0.9171          | 0.7278   |
| 0.8096        | 39.0  | 23010 | 0.9196          | 0.7428   |
| 0.7901        | 40.0  | 23600 | 0.9168          | 0.7422   |
| 0.8011        | 41.0  | 24190 | 0.9136          | 0.7297   |
| 0.7908        | 42.0  | 24780 | 0.9080          | 0.7385   |
| 0.7755        | 43.0  | 25370 | 0.9270          | 0.7446   |
| 0.786         | 44.0  | 25960 | 0.8954          | 0.7333   |
| 0.7664        | 45.0  | 26550 | 0.9038          | 0.7410   |
| 0.7725        | 46.0  | 27140 | 0.8874          | 0.7431   |
| 0.7607        | 47.0  | 27730 | 0.9019          | 0.7416   |
| 0.7683        | 48.0  | 28320 | 0.9069          | 0.7456   |
| 0.7594        | 49.0  | 28910 | 0.9003          | 0.7318   |
| 0.7317        | 50.0  | 29500 | 0.8860          | 0.7428   |
| 0.7306        | 51.0  | 30090 | 0.8862          | 0.7434   |
| 0.736         | 52.0  | 30680 | 0.8952          | 0.7471   |
| 0.7343        | 53.0  | 31270 | 0.8761          | 0.7419   |
| 0.7248        | 54.0  | 31860 | 0.8876          | 0.7309   |
| 0.7334        | 55.0  | 32450 | 0.8841          | 0.7431   |
| 0.7458        | 56.0  | 33040 | 0.8817          | 0.7434   |
| 0.727         | 57.0  | 33630 | 0.8743          | 0.7431   |
| 0.7077        | 58.0  | 34220 | 0.8741          | 0.7422   |
| 0.7158        | 59.0  | 34810 | 0.8768          | 0.7446   |
| 0.7061        | 60.0  | 35400 | 0.8743          | 0.7407   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3