# 1_5e-3_1_0.5

This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the super_glue dataset. It achieves the following results on the evaluation set:
- Loss: 0.5068
- Accuracy: 0.7388
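
A minimal inference sketch, assuming the checkpoint uses a sequence-classification head and lives at a hypothetical repo id (the card does not say where the weights are hosted or which SuperGLUE task they target):

```python
# Hedged usage sketch; "your-username/1_5e-3_1_0.5" is a placeholder repo id.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "your-username/1_5e-3_1_0.5"  # hypothetical path to this checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Most SuperGLUE tasks are sentence-pair classification, so pass two segments.
inputs = tokenizer(
    "Is the sky blue during the day?",                       # e.g. a question
    "The sky appears blue because of Rayleigh scattering.",  # e.g. a passage
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```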
## Model description

More information needed
## Intended uses & limitations

More information needed
## Training and evaluation data

More information needed
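
The card leaves this section blank, but arithmetic on the results table narrows things down: 590 optimizer steps per epoch at batch size 16 implies roughly 9,430 training examples, which matches SuperGLUE BoolQ's 9,427-example training split. Treat the config below as an inference, not a documented fact:

```python
# Hedged data-loading sketch; the "boolq" config is inferred from the step
# count (590 steps/epoch x batch 16 ~ 9,427 examples), not documented here.
from datasets import load_dataset

dataset = load_dataset("super_glue", "boolq")
print(dataset)  # train/validation/test splits with question, passage, label
```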
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 0.005
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100.0
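
Expressed as Transformers `TrainingArguments`, this would look roughly as follows; `output_dir` and the per-epoch evaluation cadence are assumptions, and the Adam betas/epsilon listed above are the library defaults, so they need no override:

```python
# Sketch of the listed hyperparameters as TrainingArguments (Transformers 4.30).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_5e-3_1_0.5",       # hypothetical output directory
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",     # assumed: the results table logs one eval per epoch
    logging_strategy="epoch",        # assumed, matching the per-epoch training loss column
    # adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the defaults.
)
```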
### Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
0.8736 | 1.0 | 590 | 1.0074 | 0.6217 |
0.8968 | 2.0 | 1180 | 1.0334 | 0.6217 |
0.8293 | 3.0 | 1770 | 0.6363 | 0.4920 |
0.7568 | 4.0 | 2360 | 0.6064 | 0.6232 |
0.66 | 5.0 | 2950 | 0.6124 | 0.6223 |
0.6953 | 6.0 | 3540 | 0.5216 | 0.6550 |
0.6411 | 7.0 | 4130 | 0.5622 | 0.6012 |
0.5966 | 8.0 | 4720 | 0.4958 | 0.6584 |
0.5765 | 9.0 | 5310 | 0.8209 | 0.6300 |
0.6133 | 10.0 | 5900 | 0.4712 | 0.6826 |
0.605 | 11.0 | 6490 | 0.4679 | 0.7034 |
0.5325 | 12.0 | 7080 | 0.7704 | 0.6443 |
0.5728 | 13.0 | 7670 | 0.5719 | 0.6024 |
0.5194 | 14.0 | 8260 | 0.8197 | 0.6535 |
0.501 | 15.0 | 8850 | 0.4650 | 0.6758 |
0.5197 | 16.0 | 9440 | 0.4482 | 0.6908 |
0.4824 | 17.0 | 10030 | 0.5545 | 0.6208 |
0.4937 | 18.0 | 10620 | 0.8156 | 0.5514 |
0.4855 | 19.0 | 11210 | 0.4380 | 0.7061 |
0.4705 | 20.0 | 11800 | 0.4712 | 0.7055 |
0.4481 | 21.0 | 12390 | 0.4595 | 0.7098 |
0.4624 | 22.0 | 12980 | 0.5374 | 0.6532 |
0.4222 | 23.0 | 13570 | 0.4828 | 0.6731 |
0.4293 | 24.0 | 14160 | 0.4509 | 0.7147 |
0.4082 | 25.0 | 14750 | 0.4616 | 0.7018 |
0.392 | 26.0 | 15340 | 0.4615 | 0.7061 |
0.4079 | 27.0 | 15930 | 0.4404 | 0.7278 |
0.3798 | 28.0 | 16520 | 0.5590 | 0.6691 |
0.4075 | 29.0 | 17110 | 0.5303 | 0.7122 |
0.3755 | 30.0 | 17700 | 0.4535 | 0.7312 |
0.3686 | 31.0 | 18290 | 0.5050 | 0.6771 |
0.3553 | 32.0 | 18880 | 0.4831 | 0.7269 |
0.3576 | 33.0 | 19470 | 0.4556 | 0.7177 |
0.343 | 34.0 | 20060 | 0.4762 | 0.7269 |
0.3275 | 35.0 | 20650 | 0.4346 | 0.7275 |
0.327 | 36.0 | 21240 | 0.4859 | 0.7269 |
0.3328 | 37.0 | 21830 | 0.4580 | 0.7080 |
0.3228 | 38.0 | 22420 | 0.4488 | 0.7266 |
0.3103 | 39.0 | 23010 | 0.4543 | 0.7379 |
0.2946 | 40.0 | 23600 | 0.4612 | 0.7379 |
0.3044 | 41.0 | 24190 | 0.5015 | 0.7352 |
0.3008 | 42.0 | 24780 | 0.4525 | 0.7281 |
0.2823 | 43.0 | 25370 | 0.5095 | 0.7278 |
0.2779 | 44.0 | 25960 | 0.4926 | 0.7095 |
0.2763 | 45.0 | 26550 | 0.4621 | 0.7343 |
0.2726 | 46.0 | 27140 | 0.4941 | 0.7343 |
0.2714 | 47.0 | 27730 | 0.4843 | 0.7187 |
0.2637 | 48.0 | 28320 | 0.5355 | 0.7336 |
0.2699 | 49.0 | 28910 | 0.4733 | 0.7355 |
0.2579 | 50.0 | 29500 | 0.4887 | 0.7187 |
0.2416 | 51.0 | 30090 | 0.4815 | 0.7211 |
0.248 | 52.0 | 30680 | 0.4938 | 0.7287 |
0.2424 | 53.0 | 31270 | 0.5618 | 0.6960 |
0.2333 | 54.0 | 31860 | 0.4903 | 0.7333 |
0.2392 | 55.0 | 32450 | 0.5097 | 0.7343 |
0.2481 | 56.0 | 33040 | 0.5276 | 0.7352 |
0.2291 | 57.0 | 33630 | 0.4934 | 0.7327 |
0.2181 | 58.0 | 34220 | 0.5084 | 0.7294 |
0.227 | 59.0 | 34810 | 0.5020 | 0.7266 |
0.2242 | 60.0 | 35400 | 0.5140 | 0.7315 |
0.2243 | 61.0 | 35990 | 0.5246 | 0.7297 |
0.2218 | 62.0 | 36580 | 0.4869 | 0.7275 |
0.2078 | 63.0 | 37170 | 0.4971 | 0.7187 |
0.2194 | 64.0 | 37760 | 0.5192 | 0.7251 |
0.2078 | 65.0 | 38350 | 0.5858 | 0.7410 |
0.2079 | 66.0 | 38940 | 0.5299 | 0.7361 |
0.2019 | 67.0 | 39530 | 0.4952 | 0.7306 |
0.2076 | 68.0 | 40120 | 0.5006 | 0.7324 |
0.2013 | 69.0 | 40710 | 0.5055 | 0.7343 |
0.2047 | 70.0 | 41300 | 0.5223 | 0.7336 |
0.2049 | 71.0 | 41890 | 0.5265 | 0.7162 |
0.1916 | 72.0 | 42480 | 0.5238 | 0.7407 |
0.1896 | 73.0 | 43070 | 0.4899 | 0.7361 |
0.19 | 74.0 | 43660 | 0.5060 | 0.7315 |
0.1918 | 75.0 | 44250 | 0.5260 | 0.7346 |
0.1877 | 76.0 | 44840 | 0.5053 | 0.7336 |
0.1952 | 77.0 | 45430 | 0.5019 | 0.7382 |
0.1851 | 78.0 | 46020 | 0.4942 | 0.7336 |
0.1862 | 79.0 | 46610 | 0.5213 | 0.7398 |
0.1833 | 80.0 | 47200 | 0.5167 | 0.7343 |
0.181 | 81.0 | 47790 | 0.5394 | 0.7358 |
0.186 | 82.0 | 48380 | 0.5684 | 0.7336 |
0.1825 | 83.0 | 48970 | 0.5106 | 0.7373 |
0.1713 | 84.0 | 49560 | 0.5482 | 0.7410 |
0.174 | 85.0 | 50150 | 0.5182 | 0.7385 |
0.1712 | 86.0 | 50740 | 0.5350 | 0.7376 |
0.1687 | 87.0 | 51330 | 0.5074 | 0.7391 |
0.172 | 88.0 | 51920 | 0.5126 | 0.7382 |
0.1702 | 89.0 | 52510 | 0.4916 | 0.7275 |
0.1695 | 90.0 | 53100 | 0.5229 | 0.7370 |
0.1705 | 91.0 | 53690 | 0.4987 | 0.7401 |
0.1703 | 92.0 | 54280 | 0.4968 | 0.7254 |
0.1696 | 93.0 | 54870 | 0.5109 | 0.7382 |
0.1651 | 94.0 | 55460 | 0.5180 | 0.7413 |
0.1623 | 95.0 | 56050 | 0.5017 | 0.7385 |
0.1659 | 96.0 | 56640 | 0.5077 | 0.7407 |
0.1592 | 97.0 | 57230 | 0.5173 | 0.7394 |
0.1608 | 98.0 | 57820 | 0.5034 | 0.7413 |
0.1599 | 99.0 | 58410 | 0.5079 | 0.7407 |
0.1638 | 100.0 | 59000 | 0.5068 | 0.7388 |
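
The accuracy column is standard classification accuracy on the validation set. A minimal sketch of how such a metric is typically wired into the `Trainer`, assuming the `evaluate` library (the card does not state how the metric was computed):

```python
# Hedged metric sketch; assumes the "accuracy" metric from the evaluate library.
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # argmax over class logits
    return accuracy.compute(predictions=predictions, references=labels)
```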
### Framework versions
- Transformers 4.30.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3