
1_9e-3_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows these results):

  • Loss: 0.8873
  • Accuracy: 0.7443
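
The card does not yet include a usage example, so here is a minimal inference sketch. It assumes the checkpoint is hosted on the Hub as Onutoa/1_9e-3_1_0.1 and carries a sequence-classification head; the exact SuperGLUE subtask (and therefore the expected input format) is not stated in this card, so the sentence pair below is purely illustrative.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Repo id as listed on the Hub page; the classification head is an assumption.
model_id = "Onutoa/1_9e-3_1_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Illustrative input only: the actual SuperGLUE subtask is not documented here.
inputs = tokenizer("Example premise.", "Example hypothesis.", return_tensors="pt")
logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```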

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.009
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
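
As a point of reference, the settings above map onto transformers.TrainingArguments (v4.30 API) roughly as follows; the output directory and the per-epoch evaluation strategy are assumptions (the latter inferred from the per-epoch results table below), not values stated in the card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_9e-3_1_0.1",       # assumed; not stated in the card
    learning_rate=9e-3,              # 0.009
    per_device_train_batch_size=16,  # train_batch_size above
    per_device_eval_batch_size=8,    # eval_batch_size above
    seed=11,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",     # assumed from the per-epoch results table
)
```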

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.2619 | 1.0 | 590 | 0.6470 | 0.6217 |
| 1.0747 | 2.0 | 1180 | 0.6993 | 0.4211 |
| 0.8969 | 3.0 | 1770 | 0.6604 | 0.5719 |
| 0.8368 | 4.0 | 2360 | 0.7051 | 0.5043 |
| 0.8124 | 5.0 | 2950 | 0.7117 | 0.6294 |
| 0.7078 | 6.0 | 3540 | 0.6893 | 0.6557 |
| 0.6885 | 7.0 | 4130 | 1.0081 | 0.4541 |
| 0.648 | 8.0 | 4720 | 0.5951 | 0.6951 |
| 0.6353 | 9.0 | 5310 | 0.6077 | 0.6624 |
| 0.6037 | 10.0 | 5900 | 0.5867 | 0.6920 |
| 0.5823 | 11.0 | 6490 | 0.5554 | 0.7024 |
| 0.5648 | 12.0 | 7080 | 0.5959 | 0.6602 |
| 0.5628 | 13.0 | 7670 | 0.5532 | 0.6966 |
| 0.5323 | 14.0 | 8260 | 0.5416 | 0.7107 |
| 0.5218 | 15.0 | 8850 | 0.5633 | 0.6969 |
| 0.505 | 16.0 | 9440 | 0.5292 | 0.7110 |
| 0.4968 | 17.0 | 10030 | 0.5375 | 0.7235 |
| 0.4821 | 18.0 | 10620 | 0.6966 | 0.6667 |
| 0.4692 | 19.0 | 11210 | 0.5588 | 0.7254 |
| 0.4651 | 20.0 | 11800 | 0.5620 | 0.7177 |
| 0.4215 | 21.0 | 12390 | 0.5768 | 0.7306 |
| 0.4361 | 22.0 | 12980 | 0.5720 | 0.7278 |
| 0.4138 | 23.0 | 13570 | 0.6098 | 0.7321 |
| 0.3883 | 24.0 | 14160 | 0.5691 | 0.7315 |
| 0.3852 | 25.0 | 14750 | 0.5940 | 0.7315 |
| 0.3691 | 26.0 | 15340 | 0.7810 | 0.6657 |
| 0.3689 | 27.0 | 15930 | 0.6396 | 0.7220 |
| 0.3413 | 28.0 | 16520 | 0.6304 | 0.7385 |
| 0.3333 | 29.0 | 17110 | 0.6135 | 0.7343 |
| 0.3259 | 30.0 | 17700 | 0.6418 | 0.7242 |
| 0.3049 | 31.0 | 18290 | 0.6385 | 0.7327 |
| 0.3203 | 32.0 | 18880 | 0.7961 | 0.7275 |
| 0.2978 | 33.0 | 19470 | 0.6375 | 0.7260 |
| 0.2831 | 34.0 | 20060 | 0.7307 | 0.7116 |
| 0.2782 | 35.0 | 20650 | 0.7057 | 0.7422 |
| 0.2668 | 36.0 | 21240 | 0.6802 | 0.7391 |
| 0.2673 | 37.0 | 21830 | 0.7305 | 0.7260 |
| 0.2478 | 38.0 | 22420 | 0.7019 | 0.7367 |
| 0.2481 | 39.0 | 23010 | 0.7238 | 0.7465 |
| 0.2406 | 40.0 | 23600 | 0.8325 | 0.7300 |
| 0.2344 | 41.0 | 24190 | 0.8143 | 0.7367 |
| 0.2151 | 42.0 | 24780 | 0.8423 | 0.7413 |
| 0.226 | 43.0 | 25370 | 0.7901 | 0.7343 |
| 0.2141 | 44.0 | 25960 | 0.8760 | 0.7355 |
| 0.2062 | 45.0 | 26550 | 0.8387 | 0.7416 |
| 0.192 | 46.0 | 27140 | 0.7825 | 0.7413 |
| 0.2045 | 47.0 | 27730 | 0.8157 | 0.7211 |
| 0.1922 | 48.0 | 28320 | 0.8735 | 0.7190 |
| 0.1967 | 49.0 | 28910 | 0.7669 | 0.7416 |
| 0.1814 | 50.0 | 29500 | 0.7925 | 0.7401 |
| 0.1814 | 51.0 | 30090 | 0.8249 | 0.7367 |
| 0.1721 | 52.0 | 30680 | 0.8772 | 0.7352 |
| 0.1607 | 53.0 | 31270 | 0.8614 | 0.7355 |
| 0.162 | 54.0 | 31860 | 0.8165 | 0.7376 |
| 0.1745 | 55.0 | 32450 | 0.8330 | 0.7287 |
| 0.1644 | 56.0 | 33040 | 0.8343 | 0.7370 |
| 0.1478 | 57.0 | 33630 | 0.8965 | 0.7318 |
| 0.1571 | 58.0 | 34220 | 0.9214 | 0.7232 |
| 0.1506 | 59.0 | 34810 | 0.9052 | 0.7401 |
| 0.1469 | 60.0 | 35400 | 0.8536 | 0.7428 |
| 0.1472 | 61.0 | 35990 | 0.8885 | 0.7309 |
| 0.1408 | 62.0 | 36580 | 0.8733 | 0.7413 |
| 0.1356 | 63.0 | 37170 | 0.9329 | 0.7214 |
| 0.1445 | 64.0 | 37760 | 0.8954 | 0.7480 |
| 0.1398 | 65.0 | 38350 | 0.8575 | 0.7391 |
| 0.1389 | 66.0 | 38940 | 0.8679 | 0.7422 |
| 0.1278 | 67.0 | 39530 | 0.9074 | 0.7446 |
| 0.1337 | 68.0 | 40120 | 0.8901 | 0.7346 |
| 0.123 | 69.0 | 40710 | 0.9254 | 0.7453 |
| 0.1362 | 70.0 | 41300 | 0.8586 | 0.7388 |
| 0.1214 | 71.0 | 41890 | 0.9126 | 0.7321 |
| 0.1245 | 72.0 | 42480 | 0.8943 | 0.7394 |
| 0.1142 | 73.0 | 43070 | 0.9241 | 0.7349 |
| 0.1227 | 74.0 | 43660 | 0.9128 | 0.7391 |
| 0.1121 | 75.0 | 44250 | 0.8904 | 0.7373 |
| 0.1172 | 76.0 | 44840 | 0.9219 | 0.7404 |
| 0.1122 | 77.0 | 45430 | 0.9410 | 0.7486 |
| 0.1047 | 78.0 | 46020 | 0.8903 | 0.7379 |
| 0.1088 | 79.0 | 46610 | 0.9508 | 0.7330 |
| 0.1076 | 80.0 | 47200 | 0.8921 | 0.7416 |
| 0.0986 | 81.0 | 47790 | 0.8941 | 0.7327 |
| 0.1037 | 82.0 | 48380 | 0.9029 | 0.7343 |
| 0.0983 | 83.0 | 48970 | 0.8863 | 0.7370 |
| 0.104 | 84.0 | 49560 | 0.8850 | 0.7361 |
| 0.0996 | 85.0 | 50150 | 0.9146 | 0.7453 |
| 0.0994 | 86.0 | 50740 | 0.8958 | 0.7355 |
| 0.0905 | 87.0 | 51330 | 0.8989 | 0.7474 |
| 0.0953 | 88.0 | 51920 | 0.9067 | 0.7422 |
| 0.0952 | 89.0 | 52510 | 0.9108 | 0.7410 |
| 0.0947 | 90.0 | 53100 | 0.9015 | 0.7382 |
| 0.09 | 91.0 | 53690 | 0.8984 | 0.7431 |
| 0.0936 | 92.0 | 54280 | 0.8893 | 0.7339 |
| 0.0908 | 93.0 | 54870 | 0.8919 | 0.7367 |
| 0.0872 | 94.0 | 55460 | 0.9024 | 0.7450 |
| 0.0847 | 95.0 | 56050 | 0.9029 | 0.7364 |
| 0.0901 | 96.0 | 56640 | 0.9023 | 0.7385 |
| 0.085 | 97.0 | 57230 | 0.8978 | 0.7370 |
| 0.0852 | 98.0 | 57820 | 0.8812 | 0.7413 |
| 0.0887 | 99.0 | 58410 | 0.8885 | 0.7385 |
| 0.0855 | 100.0 | 59000 | 0.8873 | 0.7443 |

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
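
For anyone reproducing the run, a quick environment check against these versions (a sketch; pinning the same versions with pip works just as well):

```python
# Verify the installed versions match the card (the cu117 suffix on the
# torch build is checked loosely, since CPU-only installs report plain 2.0.1).
import datasets
import tokenizers
import torch
import transformers

assert transformers.__version__ == "4.30.0"
assert torch.__version__.startswith("2.0.1")
assert datasets.__version__ == "2.14.4"
assert tokenizers.__version__ == "0.13.3"
```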