1_6e-3_5_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9459
  • Accuracy: 0.7379
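
The card does not yet include a usage example, so the following is a minimal inference sketch. It assumes the checkpoint is published in this repository (Onutoa/1_6e-3_5_0.1) and carries a standard sequence-classification head; the specific SuperGLUE task and its label names are not documented above, so the inputs below are placeholders.

```python
# Minimal inference sketch (assumptions: repo id "Onutoa/1_6e-3_5_0.1" and a
# sequence-classification head; the exact SuperGLUE task is not stated in
# this card, so the sentence pair below is a placeholder).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_6e-3_5_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Most SuperGLUE tasks are sentence-pair problems, so encode two segments.
inputs = tokenizer(
    "First input sentence goes here.",
    "Second input sentence goes here.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(dim=-1).item()
print(model.config.id2label.get(pred, pred))  # fall back to the raw index
```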

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.006
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
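
The original training script is not included in this card, but the hyperparameters above map onto `transformers.TrainingArguments` roughly as sketched below; treat it as an approximation rather than the exact configuration used.

```python
# Approximate reconstruction of the listed hyperparameters as
# transformers.TrainingArguments (a sketch; the actual training script is
# not part of this card).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_6e-3_5_0.1",
    learning_rate=6e-3,
    per_device_train_batch_size=16,  # card lists train_batch_size: 16
    per_device_eval_batch_size=8,    # card lists eval_batch_size: 8
    seed=11,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",     # the results table reports metrics once per epoch
)
```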

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.3359 | 1.0 | 590 | 1.9035 | 0.3798 |
| 1.5253 | 2.0 | 1180 | 0.9944 | 0.6217 |
| 1.219 | 3.0 | 1770 | 0.8943 | 0.6190 |
| 1.1715 | 4.0 | 2360 | 0.9268 | 0.6205 |
| 1.0411 | 5.0 | 2950 | 0.8576 | 0.6220 |
| 1.0356 | 6.0 | 3540 | 0.9342 | 0.6067 |
| 0.9697 | 7.0 | 4130 | 1.9873 | 0.4131 |
| 0.9799 | 8.0 | 4720 | 1.7366 | 0.4492 |
| 0.9846 | 9.0 | 5310 | 1.3262 | 0.6330 |
| 0.9154 | 10.0 | 5900 | 1.0899 | 0.5697 |
| 0.8903 | 11.0 | 6490 | 0.8476 | 0.6242 |
| 0.8245 | 12.0 | 7080 | 0.9154 | 0.6902 |
| 0.8927 | 13.0 | 7670 | 0.7204 | 0.6930 |
| 0.7654 | 14.0 | 8260 | 0.8502 | 0.6908 |
| 0.7533 | 15.0 | 8850 | 0.9376 | 0.6398 |
| 0.8225 | 16.0 | 9440 | 0.7376 | 0.7073 |
| 0.6919 | 17.0 | 10030 | 1.2361 | 0.5688 |
| 0.6861 | 18.0 | 10620 | 1.1219 | 0.6116 |
| 0.6514 | 19.0 | 11210 | 0.7409 | 0.7073 |
| 0.669 | 20.0 | 11800 | 1.1160 | 0.6379 |
| 0.6611 | 21.0 | 12390 | 0.8790 | 0.7156 |
| 0.6422 | 22.0 | 12980 | 0.9649 | 0.6550 |
| 0.5883 | 23.0 | 13570 | 1.1373 | 0.6324 |
| 0.5804 | 24.0 | 14160 | 1.2809 | 0.6156 |
| 0.5509 | 25.0 | 14750 | 0.8749 | 0.7229 |
| 0.5318 | 26.0 | 15340 | 0.8741 | 0.6969 |
| 0.5223 | 27.0 | 15930 | 0.7777 | 0.7168 |
| 0.4971 | 28.0 | 16520 | 0.8501 | 0.6985 |
| 0.4599 | 29.0 | 17110 | 0.8999 | 0.7156 |
| 0.4617 | 30.0 | 17700 | 0.8970 | 0.7297 |
| 0.4523 | 31.0 | 18290 | 0.9297 | 0.7171 |
| 0.4334 | 32.0 | 18880 | 0.9673 | 0.7315 |
| 0.4215 | 33.0 | 19470 | 0.8755 | 0.7263 |
| 0.4088 | 34.0 | 20060 | 0.9157 | 0.6988 |
| 0.3842 | 35.0 | 20650 | 1.0157 | 0.7349 |
| 0.3913 | 36.0 | 21240 | 0.8419 | 0.7300 |
| 0.3737 | 37.0 | 21830 | 0.7792 | 0.7266 |
| 0.373 | 38.0 | 22420 | 0.8775 | 0.7257 |
| 0.3718 | 39.0 | 23010 | 0.8662 | 0.7309 |
| 0.3449 | 40.0 | 23600 | 0.9173 | 0.7257 |
| 0.3585 | 41.0 | 24190 | 0.8719 | 0.7339 |
| 0.3299 | 42.0 | 24780 | 0.9434 | 0.7208 |
| 0.3137 | 43.0 | 25370 | 0.9660 | 0.7324 |
| 0.3228 | 44.0 | 25960 | 0.8873 | 0.7266 |
| 0.3134 | 45.0 | 26550 | 0.8953 | 0.7202 |
| 0.2873 | 46.0 | 27140 | 0.8243 | 0.7297 |
| 0.301 | 47.0 | 27730 | 0.8633 | 0.7324 |
| 0.271 | 48.0 | 28320 | 0.9646 | 0.7217 |
| 0.2907 | 49.0 | 28910 | 0.9321 | 0.7318 |
| 0.2785 | 50.0 | 29500 | 0.8440 | 0.7407 |
| 0.2554 | 51.0 | 30090 | 1.0258 | 0.7116 |
| 0.2715 | 52.0 | 30680 | 0.9458 | 0.7223 |
| 0.2556 | 53.0 | 31270 | 0.8895 | 0.7450 |
| 0.2488 | 54.0 | 31860 | 0.8865 | 0.7410 |
| 0.2528 | 55.0 | 32450 | 0.9360 | 0.7330 |
| 0.2444 | 56.0 | 33040 | 1.0095 | 0.7373 |
| 0.2391 | 57.0 | 33630 | 0.9704 | 0.7428 |
| 0.2386 | 58.0 | 34220 | 0.9717 | 0.7401 |
| 0.2193 | 59.0 | 34810 | 0.9480 | 0.7434 |
| 0.2338 | 60.0 | 35400 | 1.0054 | 0.7315 |
| 0.229 | 61.0 | 35990 | 0.8469 | 0.7361 |
| 0.2187 | 62.0 | 36580 | 0.8841 | 0.7324 |
| 0.2127 | 63.0 | 37170 | 0.9744 | 0.7260 |
| 0.2142 | 64.0 | 37760 | 0.9097 | 0.7407 |
| 0.2138 | 65.0 | 38350 | 0.9503 | 0.7281 |
| 0.2078 | 66.0 | 38940 | 0.8941 | 0.7379 |
| 0.2027 | 67.0 | 39530 | 0.8893 | 0.7379 |
| 0.2019 | 68.0 | 40120 | 0.9128 | 0.7333 |
| 0.1911 | 69.0 | 40710 | 0.9662 | 0.7382 |
| 0.2022 | 70.0 | 41300 | 1.0329 | 0.7388 |
| 0.1882 | 71.0 | 41890 | 0.9666 | 0.7232 |
| 0.2163 | 72.0 | 42480 | 0.9655 | 0.7333 |
| 0.1884 | 73.0 | 43070 | 0.9855 | 0.7254 |
| 0.1947 | 74.0 | 43660 | 0.9542 | 0.7324 |
| 0.1861 | 75.0 | 44250 | 0.9777 | 0.7413 |
| 0.1833 | 76.0 | 44840 | 0.9576 | 0.7388 |
| 0.1861 | 77.0 | 45430 | 0.9108 | 0.7404 |
| 0.1838 | 78.0 | 46020 | 0.9292 | 0.7352 |
| 0.1764 | 79.0 | 46610 | 0.9273 | 0.7413 |
| 0.1752 | 80.0 | 47200 | 0.9498 | 0.7355 |
| 0.1709 | 81.0 | 47790 | 0.9724 | 0.7343 |
| 0.1722 | 82.0 | 48380 | 0.8921 | 0.7364 |
| 0.1701 | 83.0 | 48970 | 1.0262 | 0.7398 |
| 0.168 | 84.0 | 49560 | 0.9239 | 0.7346 |
| 0.1633 | 85.0 | 50150 | 0.9714 | 0.7349 |
| 0.1666 | 86.0 | 50740 | 0.9723 | 0.7398 |
| 0.1634 | 87.0 | 51330 | 0.9497 | 0.7419 |
| 0.1657 | 88.0 | 51920 | 0.9417 | 0.7358 |
| 0.1481 | 89.0 | 52510 | 0.9709 | 0.7419 |
| 0.1557 | 90.0 | 53100 | 0.9928 | 0.7312 |
| 0.1567 | 91.0 | 53690 | 0.9443 | 0.7388 |
| 0.1568 | 92.0 | 54280 | 0.9285 | 0.7367 |
| 0.1579 | 93.0 | 54870 | 0.9201 | 0.7376 |
| 0.157 | 94.0 | 55460 | 0.9334 | 0.7376 |
| 0.1515 | 95.0 | 56050 | 0.9646 | 0.7394 |
| 0.1486 | 96.0 | 56640 | 0.9589 | 0.7385 |
| 0.1468 | 97.0 | 57230 | 0.9423 | 0.7379 |
| 0.1453 | 98.0 | 57820 | 0.9497 | 0.7382 |
| 0.1406 | 99.0 | 58410 | 0.9602 | 0.7373 |
| 0.1467 | 100.0 | 59000 | 0.9459 | 0.7379 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
