1_5e-3_5_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. After the final training epoch (epoch 100, step 59000), it achieves the following results on the evaluation set:

Loss: 0.9136
Accuracy: 0.7355

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.005
train_batch_size: 16
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100.0
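
As a rough illustration of what these settings imply, the sketch below computes the learning rate at a given step under a linear schedule. It is not the original training script. The step counts come from the training results table (590 steps per epoch, 59000 in total); the warmup length of zero is an assumption, since the card lists no warmup setting, and the per-epoch example count assumes a single device with no gradient accumulation.

```python
# Values copied from the hyperparameter list and the training results table.
BASE_LR = 0.005          # learning_rate
TRAIN_BATCH_SIZE = 16    # train_batch_size
STEPS_PER_EPOCH = 590    # step count logged at epoch 1.0
NUM_EPOCHS = 100         # num_epochs
WARMUP_STEPS = 0         # assumption: the card lists no warmup setting

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Upper bound on examples seen per epoch, assuming one device and no
# gradient accumulation (neither is stated in the card).
max_examples_per_epoch = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    for epoch in (0, 25, 50, 75, 100):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:3d}  step {step:5d}  lr {linear_lr(step):.6f}")
    print("total steps:", TOTAL_STEPS)
    print("examples per epoch (at most):", max_examples_per_epoch)
```

Under these assumptions the rate starts at 0.005, reaches half that value at step 29500 (epoch 50), and hits zero at step 59000.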

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
1.373	1.0	590	1.2296	0.3786
1.3539	2.0	1180	1.5833	0.6217
1.2392	3.0	1770	1.2689	0.3933
1.1294	4.0	2360	0.9251	0.6272
1.1134	5.0	2950	1.1215	0.6257
1.0908	6.0	3540	0.9055	0.6235
1.0395	7.0	4130	1.6555	0.3881
1.0251	8.0	4720	1.1363	0.6226
0.9902	9.0	5310	1.0739	0.4887
0.9501	10.0	5900	0.8761	0.6208
0.9437	11.0	6490	0.9379	0.6385
0.8883	12.0	7080	0.9268	0.5755
0.9089	13.0	7670	0.8405	0.6343
0.8623	14.0	8260	0.8633	0.6578
0.8454	15.0	8850	0.8616	0.6315
0.8373	16.0	9440	0.7891	0.6709
0.8356	17.0	10030	0.7889	0.6722
0.8187	18.0	10620	0.9561	0.6049
0.7986	19.0	11210	0.7897	0.6786
0.7958	20.0	11800	0.8889	0.6593
0.7742	21.0	12390	0.8762	0.6330
0.7514	22.0	12980	0.7717	0.6933
0.7505	23.0	13570	0.7587	0.6982
0.7105	24.0	14160	0.9016	0.6749
0.7027	25.0	14750	0.8744	0.6483
0.7159	26.0	15340	0.9018	0.6266
0.6908	27.0	15930	0.7527	0.7015
0.6603	28.0	16520	0.7971	0.6997
0.6599	29.0	17110	0.7492	0.7021
0.6621	30.0	17700	0.7845	0.7031
0.6357	31.0	18290	0.7578	0.7119
0.6159	32.0	18880	0.7800	0.7067
0.6181	33.0	19470	0.9263	0.6566
0.5866	34.0	20060	0.8543	0.6771
0.5708	35.0	20650	0.7777	0.7110
0.5784	36.0	21240	0.7719	0.7125
0.5395	37.0	21830	0.7532	0.7116
0.5579	38.0	22420	0.7451	0.7098
0.5113	39.0	23010	0.7618	0.7242
0.5329	40.0	23600	0.9580	0.7135
0.4996	41.0	24190	1.0449	0.6899
0.4889	42.0	24780	0.8325	0.7193
0.4905	43.0	25370	0.9896	0.7089
0.4866	44.0	25960	0.8897	0.6991
0.4652	45.0	26550	0.8080	0.7349
0.441	46.0	27140	0.7911	0.7309
0.45	47.0	27730	0.8294	0.7263
0.4149	48.0	28320	0.8578	0.7162
0.441	49.0	28910	0.8451	0.7284
0.4105	50.0	29500	0.9310	0.7245
0.403	51.0	30090	0.8326	0.7190
0.3872	52.0	30680	0.8510	0.7220
0.3717	53.0	31270	0.8455	0.7321
0.3856	54.0	31860	0.8331	0.7260
0.3808	55.0	32450	0.8245	0.7266
0.3805	56.0	33040	0.8482	0.7303
0.3481	57.0	33630	0.9800	0.6982
0.3549	58.0	34220	0.8415	0.7235
0.3497	59.0	34810	0.8914	0.7223
0.3447	60.0	35400	0.8756	0.7239
0.3398	61.0	35990	0.9337	0.7327
0.3266	62.0	36580	0.9014	0.7333
0.3186	63.0	37170	0.9030	0.7217
0.318	64.0	37760	0.8929	0.7220
0.2978	65.0	38350	0.9019	0.7324
0.3063	66.0	38940	0.8663	0.7232
0.2943	67.0	39530	0.9055	0.7199
0.3013	68.0	40120	0.8958	0.7269
0.2862	69.0	40710	0.9173	0.7287
0.3004	70.0	41300	0.8699	0.7254
0.2917	71.0	41890	0.8956	0.7284
0.2807	72.0	42480	0.9030	0.7321
0.2687	73.0	43070	0.9436	0.7199
0.2771	74.0	43660	0.9673	0.7165
0.2703	75.0	44250	1.0024	0.7373
0.2743	76.0	44840	0.8980	0.7349
0.2587	77.0	45430	0.8994	0.7312
0.2631	78.0	46020	0.9195	0.7339
0.255	79.0	46610	0.8869	0.7361
0.2546	80.0	47200	0.9206	0.7266
0.2412	81.0	47790	0.9025	0.7373
0.2516	82.0	48380	0.9041	0.7358
0.2472	83.0	48970	0.9345	0.7358
0.2455	84.0	49560	0.9110	0.7376
0.2447	85.0	50150	0.9245	0.7306
0.238	86.0	50740	0.9204	0.7391
0.238	87.0	51330	0.9557	0.7364
0.2337	88.0	51920	0.9187	0.7349
0.2313	89.0	52510	0.9249	0.7361
0.2249	90.0	53100	0.9316	0.7422
0.2279	91.0	53690	0.9483	0.7370
0.221	92.0	54280	0.9150	0.7388
0.2291	93.0	54870	0.9243	0.7376
0.2234	94.0	55460	0.9347	0.7398
0.2239	95.0	56050	0.9169	0.7358
0.2191	96.0	56640	0.9255	0.7367
0.2213	97.0	57230	0.9130	0.7321
0.218	98.0	57820	0.9197	0.7388
0.2148	99.0	58410	0.9183	0.7391
0.2197	100.0	59000	0.9136	0.7355
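
The reported results are those of the last epoch, which is not the best row in the table: the highest accuracy is 0.7422 at epoch 90, and the lowest validation loss is 0.7451 at epoch 38. The following sketch shows one way to find such rows from the tab-separated table. The names `TABLE_EXCERPT` and `parse_rows` are hypothetical, and only four rows are included here to keep the example short; pasting the full table in their place would analyse all 100 epochs.

```python
# Excerpt of the training results table (tab-separated), values copied
# verbatim from the rows for epochs 1, 38, 90 and 100.
TABLE_EXCERPT = """Training Loss\tEpoch\tStep\tValidation Loss\tAccuracy
1.373\t1.0\t590\t1.2296\t0.3786
0.5579\t38.0\t22420\t0.7451\t0.7098
0.2249\t90.0\t53100\t0.9316\t0.7422
0.2197\t100.0\t59000\t0.9136\t0.7355
"""


def parse_rows(text: str) -> list[dict[str, float]]:
    """Turn the tab-separated table into a list of {column name: value} dicts."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split("\t")
    rows = []
    for ln in lines[1:]:
        values = [float(v) for v in ln.split("\t")]
        rows.append(dict(zip(header, values)))
    return rows


rows = parse_rows(TABLE_EXCERPT)
best_acc = max(rows, key=lambda r: r["Accuracy"])
best_loss = min(rows, key=lambda r: r["Validation Loss"])

if __name__ == "__main__":
    print("highest accuracy:", best_acc["Accuracy"], "at epoch", int(best_acc["Epoch"]))
    print("lowest validation loss:", best_loss["Validation Loss"], "at epoch", int(best_loss["Epoch"]))
```

Validation loss rises after roughly epoch 40 while training loss keeps falling, so selecting a checkpoint by validation loss and selecting one by accuracy give different epochs here.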

Framework versions

Transformers 4.30.0
PyTorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3

Onutoa/1_5e-3_5_0.1
