# 1_8e-3_1_0.5
This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:
- Loss: 0.5223
- Accuracy: 0.7101
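For a quick sanity check, the checkpoint can be loaded with the standard `transformers` sequence-classification classes. This is a minimal sketch only: the repo id below is a placeholder for wherever this checkpoint is hosted, and because the card does not state which SuperGLUE task was used, the example just runs a generic sentence-pair forward pass.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Placeholder repo id -- substitute the actual Hub path of this checkpoint.
model_id = "your-username/1_8e-3_1_0.5"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Generic sentence-pair input; the actual SuperGLUE task (and hence the
# expected input format and label meaning) is not specified in this card.
inputs = tokenizer("first sequence", "second sequence", return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs)
```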
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- learning_rate: 0.008
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100.0
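As a rough reproduction aid, the list above maps onto `TrainingArguments` as sketched below. This is an assumption-laden sketch, not the original training script: the dataset/task wiring and the metric function are omitted because the card does not specify which SuperGLUE subset was used, and `output_dir` is a placeholder.

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed above using the standard
# Trainer API (Transformers 4.30). The listed Adam betas=(0.9, 0.999) and
# epsilon=1e-08 are the Trainer defaults, so they need no explicit arguments.
args = TrainingArguments(
    output_dir="1_8e-3_1_0.5",      # placeholder
    learning_rate=8e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",    # matches the per-epoch rows below
)
```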
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
1.047 | 1.0 | 590 | 0.5930 | 0.6147 |
1.1566 | 2.0 | 1180 | 0.8138 | 0.3786 |
0.8071 | 3.0 | 1770 | 1.1906 | 0.6217 |
0.8515 | 4.0 | 2360 | 0.5963 | 0.5232 |
0.7727 | 5.0 | 2950 | 0.5584 | 0.6043 |
0.864 | 6.0 | 3540 | 1.9242 | 0.3783 |
0.7792 | 7.0 | 4130 | 0.7053 | 0.5116 |
0.768 | 8.0 | 4720 | 2.9011 | 0.3783 |
0.7931 | 9.0 | 5310 | 0.6747 | 0.3783 |
0.726 | 10.0 | 5900 | 5.3441 | 0.3783 |
0.7177 | 11.0 | 6490 | 0.7048 | 0.3783 |
0.6681 | 12.0 | 7080 | 0.6229 | 0.3783 |
0.6889 | 13.0 | 7670 | 1.0114 | 0.6205 |
0.6618 | 14.0 | 8260 | 2.8718 | 0.6217 |
0.6566 | 15.0 | 8850 | 1.5485 | 0.6217 |
0.6227 | 16.0 | 9440 | 0.7295 | 0.6220 |
0.6016 | 17.0 | 10030 | 0.6356 | 0.6217 |
0.5891 | 18.0 | 10620 | 0.9814 | 0.6266 |
0.5534 | 19.0 | 11210 | 1.4086 | 0.6205 |
0.5574 | 20.0 | 11800 | 1.9522 | 0.6211 |
0.5349 | 21.0 | 12390 | 0.5543 | 0.6355 |
0.5171 | 22.0 | 12980 | 0.5258 | 0.6780 |
0.5043 | 23.0 | 13570 | 0.7235 | 0.4746 |
0.4775 | 24.0 | 14160 | 0.5588 | 0.6428 |
0.4721 | 25.0 | 14750 | 0.5342 | 0.6731 |
0.461 | 26.0 | 15340 | 0.7023 | 0.5560 |
0.461 | 27.0 | 15930 | 1.0768 | 0.4144 |
0.4312 | 28.0 | 16520 | 0.5149 | 0.6798 |
0.4378 | 29.0 | 17110 | 0.8702 | 0.5226 |
0.4214 | 30.0 | 17700 | 0.8323 | 0.6514 |
0.4205 | 31.0 | 18290 | 0.4795 | 0.6869 |
0.3944 | 32.0 | 18880 | 0.4763 | 0.6969 |
0.3874 | 33.0 | 19470 | 1.5854 | 0.6248 |
0.3779 | 34.0 | 20060 | 0.5091 | 0.6914 |
0.3723 | 35.0 | 20650 | 0.7588 | 0.6541 |
0.3693 | 36.0 | 21240 | 0.7886 | 0.5128 |
0.3602 | 37.0 | 21830 | 1.4420 | 0.4719 |
0.3522 | 38.0 | 22420 | 0.9082 | 0.5073 |
0.3488 | 39.0 | 23010 | 0.6001 | 0.6853 |
0.3348 | 40.0 | 23600 | 0.6879 | 0.6492 |
0.3482 | 41.0 | 24190 | 1.7803 | 0.6315 |
0.3324 | 42.0 | 24780 | 0.5648 | 0.6997 |
0.3318 | 43.0 | 25370 | 0.9623 | 0.6618 |
0.336 | 44.0 | 25960 | 0.6179 | 0.6459 |
0.3167 | 45.0 | 26550 | 0.5041 | 0.6997 |
0.3069 | 46.0 | 27140 | 0.4954 | 0.7003 |
0.3078 | 47.0 | 27730 | 0.5356 | 0.7028 |
0.2981 | 48.0 | 28320 | 1.3955 | 0.6450 |
0.3037 | 49.0 | 28910 | 0.5689 | 0.6878 |
0.2887 | 50.0 | 29500 | 0.8592 | 0.5517 |
0.28 | 51.0 | 30090 | 0.5939 | 0.6838 |
0.2786 | 52.0 | 30680 | 0.6514 | 0.6765 |
0.2778 | 53.0 | 31270 | 1.8380 | 0.6339 |
0.2797 | 54.0 | 31860 | 1.1076 | 0.6440 |
0.2773 | 55.0 | 32450 | 0.4983 | 0.6972 |
0.2746 | 56.0 | 33040 | 1.5742 | 0.4483 |
0.2691 | 57.0 | 33630 | 0.8767 | 0.6498 |
0.2555 | 58.0 | 34220 | 0.6028 | 0.6113 |
0.2675 | 59.0 | 34810 | 0.7268 | 0.6664 |
0.2567 | 60.0 | 35400 | 0.5953 | 0.6593 |
0.2555 | 61.0 | 35990 | 0.5564 | 0.6795 |
0.2525 | 62.0 | 36580 | 0.7419 | 0.6009 |
0.2451 | 63.0 | 37170 | 0.5019 | 0.7043 |
0.2431 | 64.0 | 37760 | 0.5603 | 0.6997 |
0.2373 | 65.0 | 38350 | 0.5755 | 0.6612 |
0.2387 | 66.0 | 38940 | 0.6158 | 0.6254 |
0.2433 | 67.0 | 39530 | 0.5994 | 0.6150 |
0.2354 | 68.0 | 40120 | 0.5195 | 0.7101 |
0.2361 | 69.0 | 40710 | 0.5164 | 0.7076 |
0.234 | 70.0 | 41300 | 0.5001 | 0.6997 |
0.2341 | 71.0 | 41890 | 1.0352 | 0.4728 |
0.2245 | 72.0 | 42480 | 0.5045 | 0.7073 |
0.2219 | 73.0 | 43070 | 0.5208 | 0.7080 |
0.216 | 74.0 | 43660 | 0.5116 | 0.7061 |
0.2227 | 75.0 | 44250 | 0.5224 | 0.7089 |
0.2163 | 76.0 | 44840 | 0.6881 | 0.5960 |
0.217 | 77.0 | 45430 | 0.5131 | 0.7000 |
0.2209 | 78.0 | 46020 | 0.5344 | 0.7086 |
0.2094 | 79.0 | 46610 | 0.6909 | 0.6098 |
0.21 | 80.0 | 47200 | 0.7910 | 0.5829 |
0.2069 | 81.0 | 47790 | 0.7681 | 0.6575 |
0.2021 | 82.0 | 48380 | 0.5345 | 0.7083 |
0.2077 | 83.0 | 48970 | 0.5224 | 0.7043 |
0.2002 | 84.0 | 49560 | 0.5126 | 0.7015 |
0.2033 | 85.0 | 50150 | 0.5920 | 0.7003 |
0.2021 | 86.0 | 50740 | 0.5589 | 0.7040 |
0.1873 | 87.0 | 51330 | 0.5470 | 0.7101 |
0.1972 | 88.0 | 51920 | 0.5276 | 0.7040 |
0.1855 | 89.0 | 52510 | 0.5280 | 0.7049 |
0.1916 | 90.0 | 53100 | 0.5261 | 0.7046 |
0.1912 | 91.0 | 53690 | 0.5950 | 0.6569 |
0.1917 | 92.0 | 54280 | 0.5402 | 0.6850 |
0.1879 | 93.0 | 54870 | 0.5765 | 0.7037 |
0.1923 | 94.0 | 55460 | 0.5297 | 0.6991 |
0.1894 | 95.0 | 56050 | 0.5150 | 0.7083 |
0.1853 | 96.0 | 56640 | 0.5276 | 0.6976 |
0.1848 | 97.0 | 57230 | 0.5356 | 0.7113 |
0.1796 | 98.0 | 57820 | 0.5585 | 0.7086 |
0.1848 | 99.0 | 58410 | 0.5230 | 0.7101 |
0.1849 | 100.0 | 59000 | 0.5223 | 0.7101 |
### Framework versions
- Transformers 4.30.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3