
1_8e-3_10_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0109
  • Accuracy: 0.7272
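
Below is a minimal inference sketch. It assumes the checkpoint is published on the Hugging Face Hub under the repo id `Onutoa/1_8e-3_10_0.1` referenced on this page and that it carries a sequence-classification head; the input sentence pair is purely illustrative.

```python
# Minimal inference sketch (assumptions: the checkpoint lives on the Hub as
# "Onutoa/1_8e-3_10_0.1" and exposes a sequence-classification head; the
# input pair below is illustrative, not taken from this card).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_8e-3_10_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer(
    "Is the sky blue?",                   # illustrative first segment
    "The sky appears blue in daylight.",  # illustrative second segment
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```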

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):

  • learning_rate: 0.008
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
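
As a rough guide, these values map onto `TrainingArguments` as in the sketch below. Only the numeric values come from this card; the output directory is a placeholder, and interpreting the batch sizes as per-device values is an assumption.

```python
# Sketch mapping the reported hyperparameters onto TrainingArguments.
# output_dir is a placeholder; batch sizes are read as per-device (assumption).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_8e-3_10_0.1",     # placeholder path
    learning_rate=8e-3,             # 0.008
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,                 # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,              # epsilon=1e-08
)
```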

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.8619        | 1.0   | 590   | 1.0251          | 0.4685   |
| 1.3275        | 2.0   | 1180  | 1.3329          | 0.3795   |
| 1.2711        | 3.0   | 1770  | 1.3427          | 0.3817   |
| 1.2563        | 4.0   | 2360  | 0.9486          | 0.6352   |
| 1.3677        | 5.0   | 2950  | 1.5968          | 0.4266   |
| 1.2101        | 6.0   | 3540  | 2.8999          | 0.6217   |
| 1.2131        | 7.0   | 4130  | 1.7592          | 0.4410   |
| 1.0951        | 8.0   | 4720  | 1.0889          | 0.6535   |
| 1.1265        | 9.0   | 5310  | 1.6306          | 0.4963   |
| 1.0834        | 10.0  | 5900  | 0.8228          | 0.6789   |
| 0.9934        | 11.0  | 6490  | 0.9519          | 0.6789   |
| 0.9867        | 12.0  | 7080  | 1.2001          | 0.6471   |
| 0.9321        | 13.0  | 7670  | 0.7980          | 0.6850   |
| 0.914         | 14.0  | 8260  | 0.7659          | 0.7092   |
| 0.9005        | 15.0  | 8850  | 0.8234          | 0.7104   |
| 0.8728        | 16.0  | 9440  | 0.9553          | 0.6948   |
| 0.7346        | 17.0  | 10030 | 2.0394          | 0.5012   |
| 0.8001        | 18.0  | 10620 | 1.2116          | 0.6180   |
| 0.8778        | 19.0  | 11210 | 0.8516          | 0.6823   |
| 0.7117        | 20.0  | 11800 | 1.1178          | 0.6251   |
| 0.6709        | 21.0  | 12390 | 0.8929          | 0.7125   |
| 0.7554        | 22.0  | 12980 | 0.9317          | 0.6801   |
| 0.7167        | 23.0  | 13570 | 1.3876          | 0.6061   |
| 0.6239        | 24.0  | 14160 | 0.9124          | 0.6737   |
| 0.6273        | 25.0  | 14750 | 0.8818          | 0.7242   |
| 0.5882        | 26.0  | 15340 | 1.0614          | 0.6728   |
| 0.5567        | 27.0  | 15930 | 1.0177          | 0.7306   |
| 0.5606        | 28.0  | 16520 | 1.3018          | 0.6459   |
| 0.5559        | 29.0  | 17110 | 1.4926          | 0.6914   |
| 0.4879        | 30.0  | 17700 | 0.9648          | 0.6924   |
| 0.4945        | 31.0  | 18290 | 0.9028          | 0.7150   |
| 0.4876        | 32.0  | 18880 | 0.8188          | 0.7257   |
| 0.455         | 33.0  | 19470 | 1.0325          | 0.7312   |
| 0.468         | 34.0  | 20060 | 0.9495          | 0.7330   |
| 0.4324        | 35.0  | 20650 | 0.8765          | 0.7202   |
| 0.4098        | 36.0  | 21240 | 1.5105          | 0.6963   |
| 0.4002        | 37.0  | 21830 | 0.9019          | 0.7309   |
| 0.4077        | 38.0  | 22420 | 0.8470          | 0.7223   |
| 0.378         | 39.0  | 23010 | 0.9477          | 0.7196   |
| 0.3697        | 40.0  | 23600 | 0.9213          | 0.7226   |
| 0.3957        | 41.0  | 24190 | 0.9321          | 0.7260   |
| 0.338         | 42.0  | 24780 | 0.8633          | 0.7284   |
| 0.343         | 43.0  | 25370 | 0.9502          | 0.7355   |
| 0.3454        | 44.0  | 25960 | 1.1264          | 0.6930   |
| 0.3288        | 45.0  | 26550 | 1.5310          | 0.6440   |
| 0.3075        | 46.0  | 27140 | 1.0321          | 0.7067   |
| 0.326         | 47.0  | 27730 | 1.0041          | 0.7257   |
| 0.3035        | 48.0  | 28320 | 0.9984          | 0.7168   |
| 0.3318        | 49.0  | 28910 | 0.9336          | 0.7294   |
| 0.2923        | 50.0  | 29500 | 1.2029          | 0.6758   |
| 0.2813        | 51.0  | 30090 | 0.9525          | 0.7217   |
| 0.2844        | 52.0  | 30680 | 1.0021          | 0.7242   |
| 0.2706        | 53.0  | 31270 | 0.9836          | 0.7187   |
| 0.2748        | 54.0  | 31860 | 0.9966          | 0.7113   |
| 0.2585        | 55.0  | 32450 | 1.0029          | 0.7211   |
| 0.2603        | 56.0  | 33040 | 0.9700          | 0.7235   |
| 0.2442        | 57.0  | 33630 | 0.9675          | 0.7330   |
| 0.2503        | 58.0  | 34220 | 1.0088          | 0.7373   |
| 0.2473        | 59.0  | 34810 | 0.9043          | 0.7306   |
| 0.2503        | 60.0  | 35400 | 1.0069          | 0.7211   |
| 0.233         | 61.0  | 35990 | 1.0046          | 0.7245   |
| 0.2248        | 62.0  | 36580 | 1.0468          | 0.7217   |
| 0.2343        | 63.0  | 37170 | 0.9263          | 0.7202   |
| 0.2312        | 64.0  | 37760 | 1.1075          | 0.7101   |
| 0.2173        | 65.0  | 38350 | 1.0439          | 0.7205   |
| 0.2138        | 66.0  | 38940 | 1.1012          | 0.7364   |
| 0.2037        | 67.0  | 39530 | 1.0094          | 0.7336   |
| 0.2129        | 68.0  | 40120 | 0.9811          | 0.7275   |
| 0.1937        | 69.0  | 40710 | 1.0312          | 0.7419   |
| 0.2102        | 70.0  | 41300 | 1.0208          | 0.7318   |
| 0.2078        | 71.0  | 41890 | 1.0093          | 0.7174   |
| 0.2037        | 72.0  | 42480 | 1.1041          | 0.7404   |
| 0.1903        | 73.0  | 43070 | 0.9927          | 0.7318   |
| 0.1898        | 74.0  | 43660 | 1.0875          | 0.7431   |
| 0.1966        | 75.0  | 44250 | 0.9659          | 0.7257   |
| 0.1967        | 76.0  | 44840 | 1.0025          | 0.7254   |
| 0.191         | 77.0  | 45430 | 0.9488          | 0.7306   |
| 0.1916        | 78.0  | 46020 | 1.0042          | 0.7327   |
| 0.1819        | 79.0  | 46610 | 1.0258          | 0.7355   |
| 0.1794        | 80.0  | 47200 | 1.0124          | 0.7309   |
| 0.1773        | 81.0  | 47790 | 0.9920          | 0.7324   |
| 0.1852        | 82.0  | 48380 | 1.0088          | 0.7367   |
| 0.1809        | 83.0  | 48970 | 1.0702          | 0.7352   |
| 0.1695        | 84.0  | 49560 | 1.0249          | 0.7260   |
| 0.1704        | 85.0  | 50150 | 1.0086          | 0.7294   |
| 0.1698        | 86.0  | 50740 | 1.0465          | 0.7318   |
| 0.1609        | 87.0  | 51330 | 1.0387          | 0.7291   |
| 0.1654        | 88.0  | 51920 | 1.0260          | 0.7297   |
| 0.1589        | 89.0  | 52510 | 1.0342          | 0.7257   |
| 0.1624        | 90.0  | 53100 | 1.0773          | 0.7297   |
| 0.1633        | 91.0  | 53690 | 1.0567          | 0.7309   |
| 0.1593        | 92.0  | 54280 | 1.0176          | 0.7196   |
| 0.1558        | 93.0  | 54870 | 1.0428          | 0.7257   |
| 0.1536        | 94.0  | 55460 | 1.0158          | 0.7294   |
| 0.1559        | 95.0  | 56050 | 1.0159          | 0.7315   |
| 0.1577        | 96.0  | 56640 | 1.0299          | 0.7306   |
| 0.1518        | 97.0  | 57230 | 1.0132          | 0.7281   |
| 0.1477        | 98.0  | 57820 | 0.9931          | 0.7266   |
| 0.1529        | 99.0  | 58410 | 1.0248          | 0.7272   |
| 0.1445        | 100.0 | 59000 | 1.0109          | 0.7272   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
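
A quick sanity check that a local environment matches these versions; the expected strings are copied from the list above.

```python
# Compare installed package versions against those reported in this card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    have = installed[name]
    status = "OK" if have == want else "MISMATCH"
    print(f"{name}: installed {have}, card reports {want} [{status}]")
```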