
1_8e-3_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9505
  • Accuracy: 0.7318
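
The card does not say which SuperGLUE task the checkpoint was trained on, so the snippet below is only a minimal usage sketch: it loads the checkpoint as a generic sequence-classification model through the standard Transformers Auto classes. The repository id Onutoa/1_8e-3_1_0.1 comes from this card; the example sentence pair is a made-up, BoolQ-style illustration.

```python
# Minimal usage sketch (not the author's evaluation code).
# The repo id is from this card; the inputs are illustrative placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_8e-3_1_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer(
    "is the sky blue",                         # hypothetical question
    "The sky often appears blue in daytime.",  # hypothetical passage
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```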

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.008
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
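
These settings map directly onto the TrainingArguments API of Transformers 4.30. The sketch below is an assumed reconstruction rather than the author's actual script: output_dir and evaluation_strategy are guesses (the results table suggests per-epoch evaluation), and the card's train_batch_size is mapped to per_device_train_batch_size.

```python
# Assumed reconstruction of the configuration listed above,
# using the Transformers 4.30 TrainingArguments API.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_8e-3_1_0.1",       # assumed, matching this card's title
    learning_rate=8e-3,
    per_device_train_batch_size=16,  # card's train_batch_size
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",     # assumption: table reports per-epoch eval
)
```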

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:---:|:---:|:---:|:---:|:---:|
| 1.3959 | 1.0 | 590 | 0.9510 | 0.3786 |
| 1.0927 | 2.0 | 1180 | 0.6855 | 0.4780 |
| 0.9921 | 3.0 | 1770 | 1.4020 | 0.3783 |
| 1.039 | 4.0 | 2360 | 0.9930 | 0.3835 |
| 0.877 | 5.0 | 2950 | 1.3595 | 0.6217 |
| 0.8304 | 6.0 | 3540 | 0.6007 | 0.6648 |
| 0.7152 | 7.0 | 4130 | 1.3841 | 0.4086 |
| 0.7225 | 8.0 | 4720 | 0.7135 | 0.6183 |
| 0.6522 | 9.0 | 5310 | 0.5864 | 0.6966 |
| 0.6306 | 10.0 | 5900 | 1.1053 | 0.6318 |
| 0.6533 | 11.0 | 6490 | 0.6681 | 0.6939 |
| 0.5693 | 12.0 | 7080 | 0.6281 | 0.6777 |
| 0.569 | 13.0 | 7670 | 0.6301 | 0.6523 |
| 0.5168 | 14.0 | 8260 | 0.6110 | 0.6878 |
| 0.5071 | 15.0 | 8850 | 0.6350 | 0.7083 |
| 0.5042 | 16.0 | 9440 | 0.6348 | 0.7183 |
| 0.4678 | 17.0 | 10030 | 1.0429 | 0.6067 |
| 0.4545 | 18.0 | 10620 | 0.7921 | 0.6780 |
| 0.4216 | 19.0 | 11210 | 0.6437 | 0.7245 |
| 0.3986 | 20.0 | 11800 | 0.7142 | 0.7159 |
| 0.3871 | 21.0 | 12390 | 0.6949 | 0.7131 |
| 0.3852 | 22.0 | 12980 | 0.6870 | 0.7235 |
| 0.3519 | 23.0 | 13570 | 0.7979 | 0.7 |
| 0.3271 | 24.0 | 14160 | 0.9015 | 0.6875 |
| 0.3136 | 25.0 | 14750 | 0.8513 | 0.7092 |
| 0.278 | 26.0 | 15340 | 0.8899 | 0.6869 |
| 0.2931 | 27.0 | 15930 | 0.7898 | 0.7150 |
| 0.2712 | 28.0 | 16520 | 0.8953 | 0.7294 |
| 0.2494 | 29.0 | 17110 | 0.8243 | 0.7217 |
| 0.2568 | 30.0 | 17700 | 0.8979 | 0.7156 |
| 0.2488 | 31.0 | 18290 | 1.0504 | 0.7211 |
| 0.2568 | 32.0 | 18880 | 0.8953 | 0.7107 |
| 0.2465 | 33.0 | 19470 | 0.8415 | 0.7208 |
| 0.2077 | 34.0 | 20060 | 1.0351 | 0.7083 |
| 0.2202 | 35.0 | 20650 | 0.9620 | 0.7202 |
| 0.2224 | 36.0 | 21240 | 0.8594 | 0.7251 |
| 0.2133 | 37.0 | 21830 | 0.9035 | 0.7257 |
| 0.1881 | 38.0 | 22420 | 0.9327 | 0.7153 |
| 0.201 | 39.0 | 23010 | 0.9521 | 0.7220 |
| 0.197 | 40.0 | 23600 | 0.9997 | 0.7199 |
| 0.1949 | 41.0 | 24190 | 1.0048 | 0.7355 |
| 0.1739 | 42.0 | 24780 | 0.9031 | 0.7309 |
| 0.1781 | 43.0 | 25370 | 1.0229 | 0.7321 |
| 0.1726 | 44.0 | 25960 | 0.9823 | 0.7183 |
| 0.1472 | 45.0 | 26550 | 0.9605 | 0.7131 |
| 0.1628 | 46.0 | 27140 | 0.9855 | 0.7382 |
| 0.1658 | 47.0 | 27730 | 1.0724 | 0.7272 |
| 0.1563 | 48.0 | 28320 | 0.9809 | 0.7242 |
| 0.1682 | 49.0 | 28910 | 0.8878 | 0.7303 |
| 0.1432 | 50.0 | 29500 | 0.9983 | 0.7324 |
| 0.1437 | 51.0 | 30090 | 1.2073 | 0.6890 |
| 0.1431 | 52.0 | 30680 | 1.0315 | 0.7162 |
| 0.142 | 53.0 | 31270 | 1.0895 | 0.7370 |
| 0.1312 | 54.0 | 31860 | 0.9904 | 0.7355 |
| 0.1371 | 55.0 | 32450 | 0.9881 | 0.7159 |
| 0.1383 | 56.0 | 33040 | 0.9876 | 0.7443 |
| 0.128 | 57.0 | 33630 | 1.0126 | 0.7217 |
| 0.1256 | 58.0 | 34220 | 0.9730 | 0.7370 |
| 0.1283 | 59.0 | 34810 | 0.9943 | 0.7303 |
| 0.14 | 60.0 | 35400 | 0.9945 | 0.7278 |
| 0.126 | 61.0 | 35990 | 1.0015 | 0.7193 |
| 0.1232 | 62.0 | 36580 | 1.0385 | 0.7190 |
| 0.1163 | 63.0 | 37170 | 0.9850 | 0.7180 |
| 0.1204 | 64.0 | 37760 | 1.0085 | 0.7226 |
| 0.1157 | 65.0 | 38350 | 1.0784 | 0.7373 |
| 0.1154 | 66.0 | 38940 | 0.9773 | 0.7330 |
| 0.1101 | 67.0 | 39530 | 0.9884 | 0.7315 |
| 0.1138 | 68.0 | 40120 | 0.9496 | 0.7294 |
| 0.1064 | 69.0 | 40710 | 1.0320 | 0.7303 |
| 0.1031 | 70.0 | 41300 | 0.9621 | 0.7327 |
| 0.107 | 71.0 | 41890 | 0.9663 | 0.7349 |
| 0.107 | 72.0 | 42480 | 0.9714 | 0.7309 |
| 0.0958 | 73.0 | 43070 | 1.0255 | 0.7135 |
| 0.0973 | 74.0 | 43660 | 0.9705 | 0.7349 |
| 0.0989 | 75.0 | 44250 | 1.0003 | 0.7321 |
| 0.0968 | 76.0 | 44840 | 1.0130 | 0.7306 |
| 0.0947 | 77.0 | 45430 | 1.0245 | 0.7300 |
| 0.0976 | 78.0 | 46020 | 1.0305 | 0.7352 |
| 0.0916 | 79.0 | 46610 | 0.9644 | 0.7300 |
| 0.0913 | 80.0 | 47200 | 1.0130 | 0.7373 |
| 0.0911 | 81.0 | 47790 | 0.9241 | 0.7263 |
| 0.0985 | 82.0 | 48380 | 0.9843 | 0.7385 |
| 0.0876 | 83.0 | 48970 | 1.0069 | 0.7327 |
| 0.0865 | 84.0 | 49560 | 0.9806 | 0.7303 |
| 0.0872 | 85.0 | 50150 | 0.9590 | 0.7291 |
| 0.0818 | 86.0 | 50740 | 0.9917 | 0.7251 |
| 0.0828 | 87.0 | 51330 | 0.9569 | 0.7333 |
| 0.0813 | 88.0 | 51920 | 0.9769 | 0.7260 |
| 0.0763 | 89.0 | 52510 | 1.0162 | 0.7333 |
| 0.0795 | 90.0 | 53100 | 0.9829 | 0.7346 |
| 0.0788 | 91.0 | 53690 | 0.9755 | 0.7349 |
| 0.0769 | 92.0 | 54280 | 1.0030 | 0.7315 |
| 0.0739 | 93.0 | 54870 | 0.9772 | 0.7370 |
| 0.0782 | 94.0 | 55460 | 0.9850 | 0.7284 |
| 0.0746 | 95.0 | 56050 | 0.9688 | 0.7309 |
| 0.0749 | 96.0 | 56640 | 0.9492 | 0.7309 |
| 0.072 | 97.0 | 57230 | 0.9607 | 0.7303 |
| 0.0693 | 98.0 | 57820 | 0.9686 | 0.7318 |
| 0.0725 | 99.0 | 58410 | 0.9606 | 0.7312 |
| 0.0713 | 100.0 | 59000 | 0.9505 | 0.7318 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
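
To reproduce this environment, the installed packages can be checked against the versions above at runtime; a minimal sketch, assuming the standard PyPI distribution names:

```python
# Sanity-check installed packages against the versions pinned in this card.
from importlib.metadata import version

pins = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}
for pkg, pinned in pins.items():
    print(f"{pkg}: installed {version(pkg)}, card pins {pinned}")
```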
