1_8e-3_5_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9466
  • Accuracy: 0.7446
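
The snippet below is a minimal usage sketch, not taken from this card: the repo id `Onutoa/1_8e-3_5_0.1` comes from the card itself, but the specific SuperGLUE task is not stated here, so the sequence-classification head and the example sentence pair are assumptions.

```python
# Minimal inference sketch. Assumes the checkpoint carries a
# sequence-classification head; the SuperGLUE subtask is not stated
# in this card, so the example inputs are placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_8e-3_5_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Replace with inputs matching the task the model was fine-tuned on.
inputs = tokenizer(
    "Is the sky blue?",
    "The sky is blue on a clear day.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```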

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.008
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
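
As a rough guide, the hyperparameters above can be mapped onto the `transformers` `Trainer` API roughly as sketched below. This is an illustrative reconstruction, not the training script used for this model; the output directory and evaluation strategy are assumptions (the results table below logs metrics once per epoch).

```python
# Hedged sketch mapping the listed hyperparameters onto
# TrainingArguments (Transformers 4.30). Dataset loading,
# preprocessing, and the SuperGLUE subtask are omitted/assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_8e-3_5_0.1",       # hypothetical output directory
    learning_rate=8e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",     # per-epoch metrics, as in the table below
)
# The Adam settings listed above (betas=(0.9, 0.999), epsilon=1e-08)
# match the defaults of the AdamW optimizer used by Trainer.
```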

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.6159        | 1.0   | 590   | 1.3915          | 0.6214   |
| 1.4042        | 2.0   | 1180  | 1.2318          | 0.3810   |
| 1.2048        | 3.0   | 1770  | 0.9197          | 0.5642   |
| 1.2385        | 4.0   | 2360  | 0.9595          | 0.6220   |
| 1.1978        | 5.0   | 2950  | 1.2082          | 0.6220   |
| 1.1014        | 6.0   | 3540  | 1.3630          | 0.4590   |
| 1.0282        | 7.0   | 4130  | 1.1057          | 0.5538   |
| 0.9517        | 8.0   | 4720  | 0.9745          | 0.6789   |
| 0.9333        | 9.0   | 5310  | 0.7981          | 0.7040   |
| 0.8832        | 10.0  | 5900  | 0.7960          | 0.6979   |
| 0.8637        | 11.0  | 6490  | 0.7471          | 0.6920   |
| 0.8329        | 12.0  | 7080  | 0.7465          | 0.7104   |
| 0.7866        | 13.0  | 7670  | 0.7123          | 0.7034   |
| 0.7031        | 14.0  | 8260  | 0.8286          | 0.7089   |
| 0.6925        | 15.0  | 8850  | 0.7817          | 0.7061   |
| 0.6896        | 16.0  | 9440  | 0.7579          | 0.6963   |
| 0.6103        | 17.0  | 10030 | 0.8758          | 0.6563   |
| 0.6307        | 18.0  | 10620 | 1.1495          | 0.6211   |
| 0.5815        | 19.0  | 11210 | 0.7249          | 0.7315   |
| 0.554         | 20.0  | 11800 | 1.1488          | 0.6862   |
| 0.5376        | 21.0  | 12390 | 0.8074          | 0.7303   |
| 0.4969        | 22.0  | 12980 | 0.8280          | 0.6969   |
| 0.4813        | 23.0  | 13570 | 0.7972          | 0.7235   |
| 0.457         | 24.0  | 14160 | 0.8829          | 0.6807   |
| 0.4489        | 25.0  | 14750 | 0.7627          | 0.7303   |
| 0.4306        | 26.0  | 15340 | 0.9458          | 0.6945   |
| 0.4171        | 27.0  | 15930 | 1.0878          | 0.6823   |
| 0.4069        | 28.0  | 16520 | 0.8638          | 0.7125   |
| 0.3713        | 29.0  | 17110 | 0.9637          | 0.7306   |
| 0.3471        | 30.0  | 17700 | 0.8357          | 0.7205   |
| 0.341         | 31.0  | 18290 | 0.8430          | 0.7355   |
| 0.3677        | 32.0  | 18880 | 0.8911          | 0.7199   |
| 0.329         | 33.0  | 19470 | 1.0170          | 0.7000   |
| 0.3019        | 34.0  | 20060 | 0.8981          | 0.7214   |
| 0.2912        | 35.0  | 20650 | 0.8809          | 0.7306   |
| 0.2962        | 36.0  | 21240 | 0.9446          | 0.7327   |
| 0.3018        | 37.0  | 21830 | 0.9218          | 0.7254   |
| 0.2793        | 38.0  | 22420 | 0.8054          | 0.7327   |
| 0.2786        | 39.0  | 23010 | 0.9709          | 0.7180   |
| 0.2608        | 40.0  | 23600 | 1.0428          | 0.7407   |
| 0.2705        | 41.0  | 24190 | 1.2935          | 0.7266   |
| 0.2551        | 42.0  | 24780 | 0.8896          | 0.7294   |
| 0.2383        | 43.0  | 25370 | 0.9849          | 0.7361   |
| 0.2306        | 44.0  | 25960 | 0.9547          | 0.7278   |
| 0.23          | 45.0  | 26550 | 0.9607          | 0.7373   |
| 0.2192        | 46.0  | 27140 | 0.9475          | 0.7248   |
| 0.2276        | 47.0  | 27730 | 0.9442          | 0.7333   |
| 0.2129        | 48.0  | 28320 | 0.9928          | 0.7294   |
| 0.2245        | 49.0  | 28910 | 0.9539          | 0.7324   |
| 0.2229        | 50.0  | 29500 | 0.9369          | 0.7245   |
| 0.2036        | 51.0  | 30090 | 1.0106          | 0.7239   |
| 0.206         | 52.0  | 30680 | 0.9619          | 0.7410   |
| 0.2056        | 53.0  | 31270 | 0.9298          | 0.7376   |
| 0.2007        | 54.0  | 31860 | 0.9451          | 0.7333   |
| 0.1953        | 55.0  | 32450 | 0.9762          | 0.7223   |
| 0.1992        | 56.0  | 33040 | 0.9447          | 0.7416   |
| 0.1806        | 57.0  | 33630 | 0.9956          | 0.7440   |
| 0.1859        | 58.0  | 34220 | 1.0206          | 0.7391   |
| 0.191         | 59.0  | 34810 | 0.9121          | 0.7385   |
| 0.1729        | 60.0  | 35400 | 0.9958          | 0.7278   |
| 0.1773        | 61.0  | 35990 | 0.9859          | 0.7428   |
| 0.1738        | 62.0  | 36580 | 0.9922          | 0.7398   |
| 0.1709        | 63.0  | 37170 | 0.9094          | 0.7419   |
| 0.1734        | 64.0  | 37760 | 0.9329          | 0.7431   |
| 0.1698        | 65.0  | 38350 | 0.9349          | 0.7391   |
| 0.1614        | 66.0  | 38940 | 1.0098          | 0.7327   |
| 0.1609        | 67.0  | 39530 | 0.9705          | 0.7269   |
| 0.1606        | 68.0  | 40120 | 0.9001          | 0.7425   |
| 0.1564        | 69.0  | 40710 | 0.9798          | 0.7407   |
| 0.1588        | 70.0  | 41300 | 0.9898          | 0.7382   |
| 0.1585        | 71.0  | 41890 | 0.9410          | 0.7410   |
| 0.1554        | 72.0  | 42480 | 0.9762          | 0.7404   |
| 0.1471        | 73.0  | 43070 | 0.9262          | 0.7401   |
| 0.1474        | 74.0  | 43660 | 0.8916          | 0.7410   |
| 0.1504        | 75.0  | 44250 | 0.9635          | 0.7385   |
| 0.1482        | 76.0  | 44840 | 0.9420          | 0.7413   |
| 0.1538        | 77.0  | 45430 | 0.9594          | 0.7413   |
| 0.1426        | 78.0  | 46020 | 0.9633          | 0.7440   |
| 0.1419        | 79.0  | 46610 | 0.9489          | 0.7437   |
| 0.1452        | 80.0  | 47200 | 0.9420          | 0.7398   |
| 0.1437        | 81.0  | 47790 | 0.9826          | 0.7410   |
| 0.1489        | 82.0  | 48380 | 0.9691          | 0.7453   |
| 0.1386        | 83.0  | 48970 | 0.9704          | 0.7398   |
| 0.1341        | 84.0  | 49560 | 0.8968          | 0.7398   |
| 0.1345        | 85.0  | 50150 | 0.9537          | 0.7367   |
| 0.1314        | 86.0  | 50740 | 0.9844          | 0.7453   |
| 0.1291        | 87.0  | 51330 | 0.9527          | 0.7379   |
| 0.1286        | 88.0  | 51920 | 0.9672          | 0.7419   |
| 0.1261        | 89.0  | 52510 | 0.9531          | 0.7379   |
| 0.1274        | 90.0  | 53100 | 0.9543          | 0.7419   |
| 0.1227        | 91.0  | 53690 | 0.9765          | 0.7422   |
| 0.1276        | 92.0  | 54280 | 0.9331          | 0.7388   |
| 0.1222        | 93.0  | 54870 | 0.9318          | 0.7425   |
| 0.1287        | 94.0  | 55460 | 0.9397          | 0.7437   |
| 0.1207        | 95.0  | 56050 | 0.9776          | 0.7410   |
| 0.1213        | 96.0  | 56640 | 0.9470          | 0.7462   |
| 0.1243        | 97.0  | 57230 | 0.9408          | 0.7428   |
| 0.1197        | 98.0  | 57820 | 0.9454          | 0.7450   |
| 0.1251        | 99.0  | 58410 | 0.9556          | 0.7428   |
| 0.1173        | 100.0 | 59000 | 0.9466          | 0.7446   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3