# 1_1e-2_1_0.1
This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the super_glue dataset. It achieves the following results on the evaluation set:
- Loss: 0.9333
- Accuracy: 0.7315
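As a usage illustration only, a minimal inference sketch is shown below. The checkpoint path, the SuperGLUE subtask, and the sentence-pair input are assumptions, since none of them are documented in this card:

```python
# A minimal inference sketch. The checkpoint directory and the
# two-segment input are hypothetical; adapt them to the actual
# SuperGLUE subtask this model was fine-tuned on.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_dir = "path/to/1_1e-2_1_0.1"  # hypothetical local checkpoint directory
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(model_dir)
model.eval()

inputs = tokenizer(
    "is the sky blue on a clear day",                     # example first segment
    "The sky appears blue due to Rayleigh scattering.",   # example second segment
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(dim=-1).item()])
```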
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after the list):
- learning_rate: 0.01
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100.0
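For reference, here is a sketch of how the values above map onto `transformers.TrainingArguments`. Dataset loading and preprocessing are omitted, and the output directory and per-epoch evaluation strategy are assumptions, not taken from this card:

```python
# A sketch only: mirrors the hyperparameters listed above using the
# Hugging Face Trainer API (parameter names valid for Transformers 4.30).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_1e-2_1_0.1",        # hypothetical output directory
    learning_rate=1e-2,               # learning_rate: 0.01
    per_device_train_batch_size=16,   # train_batch_size: 16
    per_device_eval_batch_size=8,     # eval_batch_size: 8
    seed=11,                          # seed: 11
    lr_scheduler_type="linear",       # lr_scheduler_type: linear
    num_train_epochs=100.0,           # num_epochs: 100.0
    adam_beta1=0.9,                   # optimizer: Adam, betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,                # epsilon: 1e-08
    evaluation_strategy="epoch",      # assumption: evaluation once per epoch,
                                      # matching the per-epoch results table below
)
```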
### Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
1.34 | 1.0 | 590 | 0.8462 | 0.5199 |
1.1867 | 2.0 | 1180 | 0.6498 | 0.6220 |
0.9301 | 3.0 | 1770 | 1.2304 | 0.3780 |
0.9674 | 4.0 | 2360 | 1.3949 | 0.6217 |
1.0253 | 5.0 | 2950 | 0.6352 | 0.6502 |
0.8515 | 6.0 | 3540 | 1.6753 | 0.6217 |
0.7695 | 7.0 | 4130 | 1.0653 | 0.5021 |
0.737 | 8.0 | 4720 | 0.6902 | 0.6190 |
0.7016 | 9.0 | 5310 | 0.5830 | 0.7000 |
0.6402 | 10.0 | 5900 | 0.5490 | 0.7037 |
0.6369 | 11.0 | 6490 | 0.8935 | 0.6615 |
0.581 | 12.0 | 7080 | 0.5859 | 0.7089 |
0.5689 | 13.0 | 7670 | 0.5938 | 0.7116 |
0.516 | 14.0 | 8260 | 0.5614 | 0.7168 |
0.4991 | 15.0 | 8850 | 0.7467 | 0.6609 |
0.4822 | 16.0 | 9440 | 0.5836 | 0.7214 |
0.4744 | 17.0 | 10030 | 0.7603 | 0.6905 |
0.4437 | 18.0 | 10620 | 0.8842 | 0.6459 |
0.401 | 19.0 | 11210 | 0.6236 | 0.7257 |
0.3914 | 20.0 | 11800 | 0.8274 | 0.7205 |
0.371 | 21.0 | 12390 | 1.2395 | 0.6945 |
0.3668 | 22.0 | 12980 | 0.7150 | 0.7122 |
0.3137 | 23.0 | 13570 | 0.7551 | 0.7150 |
0.2999 | 24.0 | 14160 | 0.7089 | 0.7067 |
0.3049 | 25.0 | 14750 | 0.7955 | 0.7275 |
0.3005 | 26.0 | 15340 | 0.7884 | 0.7187 |
0.2951 | 27.0 | 15930 | 0.8277 | 0.7070 |
0.2577 | 28.0 | 16520 | 0.7660 | 0.7281 |
0.252 | 29.0 | 17110 | 0.7648 | 0.7269 |
0.2531 | 30.0 | 17700 | 0.8062 | 0.7251 |
0.2241 | 31.0 | 18290 | 0.9123 | 0.7177 |
0.2428 | 32.0 | 18880 | 1.4634 | 0.7110 |
0.2425 | 33.0 | 19470 | 0.8689 | 0.7211 |
0.2068 | 34.0 | 20060 | 0.8337 | 0.7119 |
0.2063 | 35.0 | 20650 | 0.9671 | 0.7245 |
0.2091 | 36.0 | 21240 | 0.8245 | 0.7245 |
0.2006 | 37.0 | 21830 | 0.9072 | 0.7291 |
0.1872 | 38.0 | 22420 | 0.8780 | 0.7202 |
0.1887 | 39.0 | 23010 | 0.9743 | 0.7147 |
0.1929 | 40.0 | 23600 | 1.1905 | 0.7275 |
0.1801 | 41.0 | 24190 | 0.9523 | 0.7281 |
0.1644 | 42.0 | 24780 | 0.9279 | 0.7162 |
0.1711 | 43.0 | 25370 | 0.9404 | 0.7245 |
0.1566 | 44.0 | 25960 | 0.9386 | 0.7284 |
0.1598 | 45.0 | 26550 | 0.9960 | 0.7104 |
0.1555 | 46.0 | 27140 | 1.0066 | 0.7122 |
0.1522 | 47.0 | 27730 | 0.9795 | 0.7052 |
0.1542 | 48.0 | 28320 | 0.9479 | 0.7226 |
0.1616 | 49.0 | 28910 | 0.9216 | 0.7232 |
0.146 | 50.0 | 29500 | 1.0475 | 0.7330 |
0.1328 | 51.0 | 30090 | 0.9752 | 0.7098 |
0.1334 | 52.0 | 30680 | 1.0264 | 0.7110 |
0.142 | 53.0 | 31270 | 0.9470 | 0.7327 |
0.1326 | 54.0 | 31860 | 0.9134 | 0.7333 |
0.1367 | 55.0 | 32450 | 0.9496 | 0.7217 |
0.1392 | 56.0 | 33040 | 0.9867 | 0.7306 |
0.118 | 57.0 | 33630 | 1.0509 | 0.7309 |
0.1222 | 58.0 | 34220 | 0.9824 | 0.7165 |
0.1162 | 59.0 | 34810 | 1.0020 | 0.7327 |
0.1275 | 60.0 | 35400 | 1.0136 | 0.7327 |
0.1233 | 61.0 | 35990 | 0.9981 | 0.7309 |
0.1167 | 62.0 | 36580 | 0.9955 | 0.7119 |
0.1113 | 63.0 | 37170 | 0.9447 | 0.7217 |
0.113 | 64.0 | 37760 | 1.0350 | 0.7275 |
0.1062 | 65.0 | 38350 | 0.9102 | 0.7367 |
0.1118 | 66.0 | 38940 | 1.0759 | 0.7070 |
0.0979 | 67.0 | 39530 | 0.9346 | 0.7324 |
0.1121 | 68.0 | 40120 | 1.0193 | 0.7229 |
0.0966 | 69.0 | 40710 | 1.0026 | 0.7263 |
0.0998 | 70.0 | 41300 | 1.0442 | 0.7297 |
0.0998 | 71.0 | 41890 | 0.9181 | 0.7266 |
0.0965 | 72.0 | 42480 | 0.9982 | 0.7144 |
0.0952 | 73.0 | 43070 | 0.9347 | 0.7183 |
0.0973 | 74.0 | 43660 | 1.0005 | 0.7242 |
0.0895 | 75.0 | 44250 | 1.0202 | 0.7376 |
0.0856 | 76.0 | 44840 | 0.9652 | 0.7312 |
0.0917 | 77.0 | 45430 | 1.0078 | 0.7330 |
0.091 | 78.0 | 46020 | 0.9855 | 0.7327 |
0.093 | 79.0 | 46610 | 0.9786 | 0.7370 |
0.0849 | 80.0 | 47200 | 0.9529 | 0.7407 |
0.0813 | 81.0 | 47790 | 0.9586 | 0.7303 |
0.0877 | 82.0 | 48380 | 0.9472 | 0.7349 |
0.0813 | 83.0 | 48970 | 0.9310 | 0.7303 |
0.0835 | 84.0 | 49560 | 0.9795 | 0.7361 |
0.0821 | 85.0 | 50150 | 0.9592 | 0.7346 |
0.0777 | 86.0 | 50740 | 0.9667 | 0.7303 |
0.0755 | 87.0 | 51330 | 0.9616 | 0.7343 |
0.0753 | 88.0 | 51920 | 0.9413 | 0.7336 |
0.0753 | 89.0 | 52510 | 0.9925 | 0.7284 |
0.0694 | 90.0 | 53100 | 0.9715 | 0.7358 |
0.0751 | 91.0 | 53690 | 0.9424 | 0.7300 |
0.072 | 92.0 | 54280 | 0.9396 | 0.7294 |
0.0715 | 93.0 | 54870 | 0.9579 | 0.7352 |
0.0735 | 94.0 | 55460 | 0.9577 | 0.7349 |
0.0694 | 95.0 | 56050 | 0.9331 | 0.7315 |
0.0665 | 96.0 | 56640 | 0.9441 | 0.7343 |
0.0655 | 97.0 | 57230 | 0.9610 | 0.7346 |
0.0649 | 98.0 | 57820 | 0.9345 | 0.7318 |
0.0689 | 99.0 | 58410 | 0.9403 | 0.7330 |
0.0669 | 100.0 | 59000 | 0.9333 | 0.7315 |
### Framework versions
- Transformers 4.30.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3