1_5e-3_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. After the final training epoch (epoch 100, step 59000), it achieves the following results on the evaluation set:

Loss: 0.9257
Accuracy: 0.7330

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.005
train_batch_size: 16
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100.0
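
To make the linear schedule concrete, the following is a minimal sketch of how the learning rate would change over training. It assumes zero warmup steps (this card lists no warmup setting) and decay to zero at step 59000, the last step in the results table. The function name `linear_lr` and its parameters are illustrative and not part of any library.

```python
def linear_lr(step: int, base_lr: float = 0.005, total_steps: int = 59000,
              warmup_steps: int = 0) -> float:
    """Learning rate under a linear warmup/decay schedule.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup value.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr at the end of warmup down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 590 optimizer steps per epoch, as shown in the results table.
for epoch in (0, 25, 50, 75, 100):
    step = epoch * 590
    print(f"epoch {epoch:3d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.005, is halved by epoch 50, and reaches zero at epoch 100.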

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
1.0669	1.0	590	0.6986	0.6217
0.9963	2.0	1180	1.8702	0.3792
1.0427	3.0	1770	2.2910	0.3798
0.8982	4.0	2360	0.7642	0.4159
0.9871	5.0	2950	0.9999	0.6217
0.8853	6.0	3540	0.6842	0.5278
0.8006	7.0	4130	1.4763	0.3878
0.7667	8.0	4720	0.8226	0.6239
0.7472	9.0	5310	0.7288	0.6364
0.7638	10.0	5900	0.5834	0.6636
0.7755	11.0	6490	1.6914	0.4269
0.6952	12.0	7080	0.9552	0.6324
0.7343	13.0	7670	0.5715	0.6835
0.6358	14.0	8260	1.0425	0.6284
0.6214	15.0	8850	0.6728	0.6807
0.7714	16.0	9440	0.5675	0.6991
0.6478	17.0	10030	0.6009	0.6976
0.6253	18.0	10620	0.5959	0.6942
0.5884	19.0	11210	0.6113	0.6896
0.6143	20.0	11800	0.5812	0.7165
0.5621	21.0	12390	0.5986	0.7125
0.561	22.0	12980	0.9897	0.5994
0.5203	23.0	13570	0.8431	0.6606
0.5278	24.0	14160	1.2396	0.5673
0.5013	25.0	14750	0.6779	0.6850
0.5121	26.0	15340	0.8150	0.6459
0.4987	27.0	15930	0.6473	0.7208
0.4915	28.0	16520	0.6165	0.6997
0.4362	29.0	17110	0.7189	0.6587
0.4401	30.0	17700	0.6948	0.7211
0.4488	31.0	18290	0.9311	0.6924
0.4593	32.0	18880	0.6527	0.7297
0.4209	33.0	19470	1.0135	0.6437
0.3953	34.0	20060	0.8262	0.7162
0.3813	35.0	20650	0.8390	0.6911
0.3916	36.0	21240	0.7626	0.7
0.3736	37.0	21830	0.6349	0.7199
0.3558	38.0	22420	0.6932	0.7284
0.378	39.0	23010	0.9384	0.6706
0.3104	40.0	23600	0.8561	0.7269
0.3366	41.0	24190	0.7296	0.7110
0.3089	42.0	24780	0.7695	0.7183
0.3099	43.0	25370	0.9426	0.6933
0.3225	44.0	25960	0.8238	0.7330
0.2853	45.0	26550	0.7910	0.7346
0.3031	46.0	27140	1.0613	0.6713
0.2865	47.0	27730	0.8105	0.7263
0.2736	48.0	28320	0.9241	0.7119
0.2892	49.0	28910	0.8532	0.7281
0.2582	50.0	29500	0.8393	0.7214
0.2631	51.0	30090	1.1566	0.6722
0.2496	52.0	30680	0.9162	0.6911
0.2501	53.0	31270	0.8305	0.7251
0.2362	54.0	31860	1.1556	0.6599
0.2325	55.0	32450	1.0032	0.6685
0.2539	56.0	33040	0.9128	0.7336
0.2231	57.0	33630	0.8328	0.7073
0.2123	58.0	34220	0.9290	0.7171
0.2093	59.0	34810	0.8650	0.7229
0.2151	60.0	35400	0.9212	0.7245
0.2074	61.0	35990	0.8884	0.7257
0.2072	62.0	36580	0.8822	0.7251
0.1898	63.0	37170	0.9609	0.7287
0.1936	64.0	37760	0.9800	0.6979
0.197	65.0	38350	1.0263	0.7125
0.1856	66.0	38940	0.9902	0.7404
0.1751	67.0	39530	0.8972	0.7312
0.1791	68.0	40120	1.0031	0.7248
0.1693	69.0	40710	1.0957	0.7361
0.1783	70.0	41300	1.0342	0.7349
0.1801	71.0	41890	1.0411	0.7067
0.1768	72.0	42480	0.9629	0.7211
0.1595	73.0	43070	0.9862	0.7370
0.154	74.0	43660	0.9240	0.7333
0.1578	75.0	44250	1.1158	0.7336
0.165	76.0	44840	0.9100	0.7358
0.1582	77.0	45430	0.9886	0.7324
0.1573	78.0	46020	1.0058	0.7193
0.1544	79.0	46610	0.9316	0.7199
0.1488	80.0	47200	0.9493	0.7196
0.141	81.0	47790	0.9467	0.7352
0.1479	82.0	48380	0.8841	0.7232
0.1377	83.0	48970	0.9072	0.7309
0.1372	84.0	49560	0.9831	0.7266
0.1389	85.0	50150	0.9714	0.7272
0.136	86.0	50740	0.9617	0.7364
0.1383	87.0	51330	0.9970	0.7257
0.1324	88.0	51920	0.8863	0.7190
0.1262	89.0	52510	0.9828	0.7336
0.132	90.0	53100	0.9576	0.7333
0.129	91.0	53690	0.9326	0.7321
0.1241	92.0	54280	0.9571	0.7278
0.1217	93.0	54870	0.9131	0.7306
0.1253	94.0	55460	0.9053	0.7315
0.1192	95.0	56050	0.9126	0.7349
0.1225	96.0	56640	0.9336	0.7355
0.1229	97.0	57230	0.9702	0.7272
0.1165	98.0	57820	0.9494	0.7339
0.1198	99.0	58410	0.9183	0.7324
0.1172	100.0	59000	0.9257	0.7330
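
The reported results come from the last row, but the table shows better individual epochs: the highest accuracy is 0.7404 at epoch 66 and the lowest validation loss is 0.5675 at epoch 16. The sketch below shows one way to find such rows from the tab-separated table. `ROWS` holds only a few rows copied from the table for brevity, and the names `parse_rows`, `best_acc`, and `best_loss` are illustrative.

```python
# A few rows copied from the results table (tab-separated columns):
# training loss, epoch, step, validation loss, accuracy.
ROWS = (
    "1.0669\t1.0\t590\t0.6986\t0.6217\n"
    "0.7714\t16.0\t9440\t0.5675\t0.6991\n"
    "0.2853\t45.0\t26550\t0.7910\t0.7346\n"
    "0.1856\t66.0\t38940\t0.9902\t0.7404\n"
    "0.1172\t100.0\t59000\t0.9257\t0.7330\n"
)


def parse_rows(text: str) -> list[dict]:
    """Parse tab-separated result rows into dictionaries of numbers."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, accuracy = line.split("\t")
        rows.append({
            "train_loss": float(train_loss),
            "epoch": int(float(epoch)),
            "step": int(step),
            "val_loss": float(val_loss),
            "accuracy": float(accuracy),
        })
    return rows


rows = parse_rows(ROWS)
best_acc = max(rows, key=lambda r: r["accuracy"])    # highest evaluation accuracy
best_loss = min(rows, key=lambda r: r["val_loss"])   # lowest validation loss

print(f"best accuracy: {best_acc['accuracy']:.4f} at epoch {best_acc['epoch']}")
print(f"lowest validation loss: {best_loss['val_loss']:.4f} at epoch {best_loss['epoch']}")
```

Pasting the full table into `ROWS` applies the same selection to all 100 epochs.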

Framework versions

Transformers 4.30.0
PyTorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3

Onutoa/1_5e-3_1_0.1
