1_1e-2_10_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9213
  • Accuracy: 0.7489
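
For a quick start, here is a minimal inference sketch. It assumes the checkpoint is published under the hub id Onutoa/1_1e-2_10_0.1 (the id shown on the hosting page) and that it carries a standard sequence-classification head; the specific SuperGLUE task and its label names are not documented in this card.

```python
# Minimal inference sketch; the hub id and two-segment input are assumptions,
# since the card does not say which SuperGLUE task the model was tuned on.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_1e-2_10_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Many SuperGLUE tasks are sentence-pair problems, so encode two segments.
inputs = tokenizer("First sentence.", "Second sentence.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label index
```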

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
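
The introduction above names super_glue as the training corpus, but not which SuperGLUE task. As a hedged illustration only, one task config can be loaded with datasets (the "boolq" config below is a placeholder, not a documented choice):

```python
# Illustrative only: "boolq" is a hypothetical config choice; the card does
# not record which SuperGLUE task this model was fine-tuned on.
from datasets import load_dataset

ds = load_dataset("super_glue", "boolq")
print(ds)              # DatasetDict with train/validation/test splits
print(ds["train"][0])  # one raw example
```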

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
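
As orientation, the list above maps onto transformers.TrainingArguments roughly as follows. This is a sketch, not the author's script, and every value not listed above (output directory, evaluation cadence) is an assumption.

```python
# Sketch of the listed hyperparameters as TrainingArguments (transformers 4.30).
# output_dir and evaluation_strategy are assumptions, not documented values.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="1_1e-2_10_0.1",      # assumed; mirrors the model name
    learning_rate=1e-2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",     # assumed from the per-epoch results table
)
```

Given that validation loss bottoms out early (0.7282 at epoch 16) while accuracy peaks much later (0.7554 at epoch 64), enabling load_best_model_at_end with metric_for_best_model="accuracy" would be a natural refinement, though the card does not say whether anything like it was used.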

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.8284        | 1.0   | 590   | 2.0796          | 0.6220   |
| 1.4411        | 2.0   | 1180  | 1.1449          | 0.6220   |
| 1.3365        | 3.0   | 1770  | 1.0330          | 0.6217   |
| 1.305         | 4.0   | 2360  | 0.9705          | 0.6349   |
| 1.1782        | 5.0   | 2950  | 0.9411          | 0.6339   |
| 1.1021        | 6.0   | 3540  | 1.4542          | 0.6223   |
| 1.091         | 7.0   | 4130  | 1.3703          | 0.4969   |
| 0.9725        | 8.0   | 4720  | 1.4839          | 0.6425   |
| 0.9313        | 9.0   | 5310  | 0.7887          | 0.7009   |
| 0.8889        | 10.0  | 5900  | 0.8354          | 0.7052   |
| 0.8457        | 11.0  | 6490  | 0.8120          | 0.6807   |
| 0.7264        | 12.0  | 7080  | 0.9915          | 0.6190   |
| 0.7354        | 13.0  | 7670  | 0.7554          | 0.7205   |
| 0.686         | 14.0  | 8260  | 0.8069          | 0.7183   |
| 0.6549        | 15.0  | 8850  | 0.7395          | 0.7379   |
| 0.6278        | 16.0  | 9440  | 0.7282          | 0.7275   |
| 0.5753        | 17.0  | 10030 | 0.9035          | 0.6795   |
| 0.5773        | 18.0  | 10620 | 0.8699          | 0.6887   |
| 0.5437        | 19.0  | 11210 | 0.7501          | 0.7226   |
| 0.5266        | 20.0  | 11800 | 0.9360          | 0.7336   |
| 0.509         | 21.0  | 12390 | 0.8204          | 0.7199   |
| 0.497         | 22.0  | 12980 | 0.7944          | 0.7343   |
| 0.4379        | 23.0  | 13570 | 0.8074          | 0.7147   |
| 0.4276        | 24.0  | 14160 | 0.8147          | 0.7306   |
| 0.4132        | 25.0  | 14750 | 0.8578          | 0.7373   |
| 0.3944        | 26.0  | 15340 | 0.9502          | 0.7015   |
| 0.3845        | 27.0  | 15930 | 0.8962          | 0.7021   |
| 0.3754        | 28.0  | 16520 | 0.8571          | 0.7275   |
| 0.3478        | 29.0  | 17110 | 0.8433          | 0.7373   |
| 0.3561        | 30.0  | 17700 | 0.8819          | 0.7327   |
| 0.3301        | 31.0  | 18290 | 0.8623          | 0.7382   |
| 0.3217        | 32.0  | 18880 | 0.9132          | 0.7419   |
| 0.3182        | 33.0  | 19470 | 0.9184          | 0.7281   |
| 0.2892        | 34.0  | 20060 | 0.8482          | 0.7358   |
| 0.2915        | 35.0  | 20650 | 0.8988          | 0.7474   |
| 0.2816        | 36.0  | 21240 | 0.8834          | 0.7446   |
| 0.2763        | 37.0  | 21830 | 0.9208          | 0.7251   |
| 0.2679        | 38.0  | 22420 | 0.8656          | 0.7379   |
| 0.2785        | 39.0  | 23010 | 0.9177          | 0.7315   |
| 0.2551        | 40.0  | 23600 | 0.9989          | 0.7508   |
| 0.2491        | 41.0  | 24190 | 0.9483          | 0.7505   |
| 0.2482        | 42.0  | 24780 | 0.8921          | 0.7391   |
| 0.2577        | 43.0  | 25370 | 0.9175          | 0.7459   |
| 0.24          | 44.0  | 25960 | 0.9345          | 0.7453   |
| 0.2368        | 45.0  | 26550 | 0.9161          | 0.7428   |
| 0.2261        | 46.0  | 27140 | 0.8859          | 0.7315   |
| 0.2317        | 47.0  | 27730 | 0.8984          | 0.7437   |
| 0.218         | 48.0  | 28320 | 0.8986          | 0.7465   |
| 0.224         | 49.0  | 28910 | 0.8665          | 0.7431   |
| 0.2064        | 50.0  | 29500 | 0.8869          | 0.7492   |
| 0.2163        | 51.0  | 30090 | 0.8786          | 0.7394   |
| 0.2145        | 52.0  | 30680 | 0.9545          | 0.7446   |
| 0.1998        | 53.0  | 31270 | 0.8586          | 0.7462   |
| 0.2008        | 54.0  | 31860 | 0.9008          | 0.7446   |
| 0.1978        | 55.0  | 32450 | 0.9236          | 0.7471   |
| 0.2025        | 56.0  | 33040 | 0.8906          | 0.7474   |
| 0.1903        | 57.0  | 33630 | 0.9517          | 0.7459   |
| 0.1846        | 58.0  | 34220 | 0.9696          | 0.7529   |
| 0.1819        | 59.0  | 34810 | 0.9163          | 0.7419   |
| 0.1883        | 60.0  | 35400 | 0.9419          | 0.7373   |
| 0.1851        | 61.0  | 35990 | 0.9657          | 0.7419   |
| 0.1805        | 62.0  | 36580 | 0.9279          | 0.7413   |
| 0.1866        | 63.0  | 37170 | 0.8996          | 0.7495   |
| 0.1752        | 64.0  | 37760 | 0.9427          | 0.7554   |
| 0.1703        | 65.0  | 38350 | 0.9364          | 0.7379   |
| 0.1702        | 66.0  | 38940 | 0.9546          | 0.7502   |
| 0.1688        | 67.0  | 39530 | 0.9265          | 0.7498   |
| 0.1724        | 68.0  | 40120 | 0.9043          | 0.7446   |
| 0.1635        | 69.0  | 40710 | 0.9426          | 0.7465   |
| 0.1652        | 70.0  | 41300 | 0.9702          | 0.7471   |
| 0.1643        | 71.0  | 41890 | 0.9191          | 0.7379   |
| 0.1684        | 72.0  | 42480 | 0.9362          | 0.7526   |
| 0.1575        | 73.0  | 43070 | 0.9399          | 0.7511   |
| 0.1585        | 74.0  | 43660 | 0.9585          | 0.7483   |
| 0.1551        | 75.0  | 44250 | 0.9481          | 0.7532   |
| 0.1587        | 76.0  | 44840 | 0.9233          | 0.7483   |
| 0.1499        | 77.0  | 45430 | 0.9115          | 0.7508   |
| 0.1541        | 78.0  | 46020 | 0.9531          | 0.7535   |
| 0.1505        | 79.0  | 46610 | 0.9306          | 0.7456   |
| 0.1521        | 80.0  | 47200 | 0.9185          | 0.7535   |
| 0.1448        | 81.0  | 47790 | 0.9228          | 0.7459   |
| 0.1475        | 82.0  | 48380 | 0.9214          | 0.7446   |
| 0.1491        | 83.0  | 48970 | 0.9355          | 0.7465   |
| 0.1433        | 84.0  | 49560 | 0.9403          | 0.7523   |
| 0.1416        | 85.0  | 50150 | 0.9270          | 0.7492   |
| 0.1391        | 86.0  | 50740 | 0.9208          | 0.7517   |
| 0.1391        | 87.0  | 51330 | 0.9134          | 0.7517   |
| 0.1415        | 88.0  | 51920 | 0.9198          | 0.7486   |
| 0.1343        | 89.0  | 52510 | 0.9380          | 0.7483   |
| 0.128         | 90.0  | 53100 | 0.9429          | 0.7505   |
| 0.1328        | 91.0  | 53690 | 0.9211          | 0.7529   |
| 0.1311        | 92.0  | 54280 | 0.9180          | 0.7431   |
| 0.1383        | 93.0  | 54870 | 0.9522          | 0.7535   |
| 0.133         | 94.0  | 55460 | 0.9047          | 0.7486   |
| 0.1331        | 95.0  | 56050 | 0.9339          | 0.7526   |
| 0.1304        | 96.0  | 56640 | 0.9177          | 0.7480   |
| 0.1293        | 97.0  | 57230 | 0.9194          | 0.7471   |
| 0.128         | 98.0  | 57820 | 0.9213          | 0.7492   |
| 0.1268        | 99.0  | 58410 | 0.9260          | 0.7492   |
| 0.1297        | 100.0 | 59000 | 0.9213          | 0.7489   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
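
For reproducibility, the pinned versions above can be sanity-checked at runtime; a small sketch (package names are the standard PyPI ones):

```python
# Verify the environment matches the versions listed on this card.
import datasets, tokenizers, torch, transformers

assert transformers.__version__.startswith("4.30")
assert torch.__version__.startswith("2.0.1")
assert datasets.__version__.startswith("2.14.4")
assert tokenizers.__version__.startswith("0.13.3")
```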
