1_7e-3_5_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7994
  • Accuracy: 0.7590
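
For quick experimentation, the checkpoint can be loaded with the standard Transformers classes. The sketch below is illustrative only: it assumes the model is published under the repo id `Onutoa/1_7e-3_5_0.5` and exposes a sequence-classification head. The card does not state which SuperGLUE task the model was trained on, so the question/passage input format is an assumption.

```python
# Minimal usage sketch. Assumptions: the checkpoint lives at
# "Onutoa/1_7e-3_5_0.5" and carries a sequence-classification head;
# the exact SuperGLUE task (and hence the input format) is not stated
# in this card, so the question/passage pairing is illustrative.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_7e-3_5_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

question = "Is the sky blue?"
passage = "The sky appears blue because of Rayleigh scattering."
inputs = tokenizer(question, passage, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(dim=-1).item()
print(model.config.id2label.get(pred, pred))
```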

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a reproduction sketch follows the list:

  • learning_rate: 0.007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
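
These settings map onto the Hugging Face `Trainer` API roughly as shown below. This is a reproduction sketch under stated assumptions, not the author's actual script: the SuperGLUE task, preprocessing, and metric computation are not documented in this card, so the `boolq` configuration and its column names are placeholders.

```python
# Reproduction sketch of the hyperparameters listed above, using the
# Hugging Face Trainer. The specific SuperGLUE task is not stated in
# this card; "boolq" and its column names are placeholder assumptions.
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
    set_seed,
)

set_seed(11)  # seed: 11

dataset = load_dataset("super_glue", "boolq")  # task choice is an assumption
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased")

def tokenize(batch):
    # Column names depend on the chosen task; these match BoolQ.
    return tokenizer(batch["question"], batch["passage"], truncation=True)

dataset = dataset.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased")

def compute_metrics(eval_pred):
    # Accuracy, matching the metric reported in the results table.
    preds = np.argmax(eval_pred.predictions, axis=-1)
    return {"accuracy": (preds == eval_pred.label_ids).mean()}

args = TrainingArguments(
    output_dir="1_7e-3_5_0.5",
    learning_rate=7e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",
    seed=11,
    evaluation_strategy="epoch",  # the results table reports per-epoch eval
    # The Trainer's default optimizer already matches the listed Adam
    # settings: betas=(0.9, 0.999), epsilon=1e-08.
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
```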

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 2.4468 | 1.0 | 590 | 2.2373 | 0.6183 |
| 2.615 | 2.0 | 1180 | 1.8655 | 0.5557 |
| 2.2782 | 3.0 | 1770 | 1.8976 | 0.5260 |
| 1.7962 | 4.0 | 2360 | 1.7110 | 0.5746 |
| 1.6241 | 5.0 | 2950 | 1.4946 | 0.6801 |
| 1.4269 | 6.0 | 3540 | 1.3572 | 0.6972 |
| 1.4106 | 7.0 | 4130 | 1.3887 | 0.6394 |
| 1.3024 | 8.0 | 4720 | 1.2780 | 0.6966 |
| 1.2769 | 9.0 | 5310 | 1.1492 | 0.6896 |
| 1.1959 | 10.0 | 5900 | 1.4278 | 0.6936 |
| 1.1842 | 11.0 | 6490 | 1.0641 | 0.7156 |
| 1.103 | 12.0 | 7080 | 1.0075 | 0.7232 |
| 1.0823 | 13.0 | 7670 | 1.0099 | 0.7086 |
| 1.0542 | 14.0 | 8260 | 1.0171 | 0.7294 |
| 1.0489 | 15.0 | 8850 | 0.9553 | 0.7297 |
| 1.0048 | 16.0 | 9440 | 0.9329 | 0.7336 |
| 0.9169 | 17.0 | 10030 | 0.9543 | 0.7321 |
| 0.9179 | 18.0 | 10620 | 0.9167 | 0.7327 |
| 0.8928 | 19.0 | 11210 | 0.9433 | 0.7404 |
| 0.8929 | 20.0 | 11800 | 1.0377 | 0.7346 |
| 0.8262 | 21.0 | 12390 | 0.8871 | 0.7440 |
| 0.8508 | 22.0 | 12980 | 0.9002 | 0.7434 |
| 0.8101 | 23.0 | 13570 | 0.8907 | 0.7471 |
| 0.7787 | 24.0 | 14160 | 0.8993 | 0.7471 |
| 0.7706 | 25.0 | 14750 | 0.8341 | 0.7440 |
| 0.7485 | 26.0 | 15340 | 0.8837 | 0.7376 |
| 0.7498 | 27.0 | 15930 | 0.8711 | 0.7385 |
| 0.7175 | 28.0 | 16520 | 0.9197 | 0.7495 |
| 0.7034 | 29.0 | 17110 | 0.8367 | 0.7434 |
| 0.685 | 30.0 | 17700 | 0.8322 | 0.7459 |
| 0.6718 | 31.0 | 18290 | 0.8840 | 0.7474 |
| 0.6746 | 32.0 | 18880 | 0.8978 | 0.7492 |
| 0.6579 | 33.0 | 19470 | 0.8499 | 0.7456 |
| 0.6305 | 34.0 | 20060 | 0.8291 | 0.7480 |
| 0.6316 | 35.0 | 20650 | 0.8555 | 0.7385 |
| 0.6198 | 36.0 | 21240 | 0.8694 | 0.7557 |
| 0.616 | 37.0 | 21830 | 0.8268 | 0.7599 |
| 0.6331 | 38.0 | 22420 | 0.8227 | 0.7505 |
| 0.6077 | 39.0 | 23010 | 0.9053 | 0.7554 |
| 0.5947 | 40.0 | 23600 | 0.9019 | 0.7554 |
| 0.5773 | 41.0 | 24190 | 0.8128 | 0.7584 |
| 0.57 | 42.0 | 24780 | 0.8028 | 0.7609 |
| 0.5686 | 43.0 | 25370 | 0.8444 | 0.7621 |
| 0.564 | 44.0 | 25960 | 0.8285 | 0.7459 |
| 0.5584 | 45.0 | 26550 | 0.8303 | 0.7544 |
| 0.5408 | 46.0 | 27140 | 0.8650 | 0.7560 |
| 0.54 | 47.0 | 27730 | 0.8684 | 0.7370 |
| 0.528 | 48.0 | 28320 | 0.8171 | 0.7581 |
| 0.5499 | 49.0 | 28910 | 0.8792 | 0.7550 |
| 0.5295 | 50.0 | 29500 | 0.8192 | 0.7578 |
| 0.5138 | 51.0 | 30090 | 0.8493 | 0.7578 |
| 0.516 | 52.0 | 30680 | 0.8111 | 0.7581 |
| 0.5066 | 53.0 | 31270 | 0.8026 | 0.7514 |
| 0.5061 | 54.0 | 31860 | 0.8134 | 0.7609 |
| 0.5061 | 55.0 | 32450 | 0.8229 | 0.7618 |
| 0.4903 | 56.0 | 33040 | 0.8253 | 0.7590 |
| 0.4876 | 57.0 | 33630 | 0.8467 | 0.7596 |
| 0.4842 | 58.0 | 34220 | 0.8295 | 0.7566 |
| 0.4743 | 59.0 | 34810 | 0.8587 | 0.7385 |
| 0.484 | 60.0 | 35400 | 0.7973 | 0.7550 |
| 0.4686 | 61.0 | 35990 | 0.8244 | 0.7593 |
| 0.4734 | 62.0 | 36580 | 0.8127 | 0.7615 |
| 0.4655 | 63.0 | 37170 | 0.8271 | 0.7529 |
| 0.457 | 64.0 | 37760 | 0.7995 | 0.7544 |
| 0.4643 | 65.0 | 38350 | 0.8315 | 0.7642 |
| 0.4535 | 66.0 | 38940 | 0.8044 | 0.7575 |
| 0.4445 | 67.0 | 39530 | 0.8785 | 0.7602 |
| 0.4546 | 68.0 | 40120 | 0.7933 | 0.7587 |
| 0.4427 | 69.0 | 40710 | 0.8548 | 0.7602 |
| 0.4441 | 70.0 | 41300 | 0.8274 | 0.7627 |
| 0.4514 | 71.0 | 41890 | 0.7980 | 0.7495 |
| 0.4468 | 72.0 | 42480 | 0.8562 | 0.7572 |
| 0.415 | 73.0 | 43070 | 0.8126 | 0.7636 |
| 0.4225 | 74.0 | 43660 | 0.8120 | 0.7596 |
| 0.4372 | 75.0 | 44250 | 0.8545 | 0.7602 |
| 0.4295 | 76.0 | 44840 | 0.8148 | 0.7462 |
| 0.4351 | 77.0 | 45430 | 0.8043 | 0.7642 |
| 0.4379 | 78.0 | 46020 | 0.7927 | 0.7621 |
| 0.4282 | 79.0 | 46610 | 0.7931 | 0.7624 |
| 0.4169 | 80.0 | 47200 | 0.8081 | 0.7596 |
| 0.4142 | 81.0 | 47790 | 0.8231 | 0.7602 |
| 0.4149 | 82.0 | 48380 | 0.8266 | 0.7602 |
| 0.409 | 83.0 | 48970 | 0.8020 | 0.7593 |
| 0.4084 | 84.0 | 49560 | 0.8396 | 0.7621 |
| 0.4012 | 85.0 | 50150 | 0.8049 | 0.7606 |
| 0.4056 | 86.0 | 50740 | 0.7971 | 0.7566 |
| 0.3991 | 87.0 | 51330 | 0.8462 | 0.7599 |
| 0.4019 | 88.0 | 51920 | 0.8056 | 0.7569 |
| 0.394 | 89.0 | 52510 | 0.8047 | 0.7554 |
| 0.3985 | 90.0 | 53100 | 0.8150 | 0.7609 |
| 0.3978 | 91.0 | 53690 | 0.8178 | 0.7606 |
| 0.4036 | 92.0 | 54280 | 0.7915 | 0.7560 |
| 0.3859 | 93.0 | 54870 | 0.8072 | 0.7599 |
| 0.4053 | 94.0 | 55460 | 0.8112 | 0.7606 |
| 0.3889 | 95.0 | 56050 | 0.8010 | 0.7587 |
| 0.3866 | 96.0 | 56640 | 0.8017 | 0.7578 |
| 0.3806 | 97.0 | 57230 | 0.7965 | 0.7584 |
| 0.3816 | 98.0 | 57820 | 0.7979 | 0.7590 |
| 0.3791 | 99.0 | 58410 | 0.7982 | 0.7575 |
| 0.3782 | 100.0 | 59000 | 0.7994 | 0.7590 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
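
A quick way to confirm an environment matches the versions above (a minimal check, assuming the four packages are installed):

```python
# Print installed versions to compare against those listed above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # expected: 4.30.0
print("Pytorch:", torch.__version__)              # expected: 2.0.1+cu117
print("Datasets:", datasets.__version__)          # expected: 2.14.4
print("Tokenizers:", tokenizers.__version__)      # expected: 0.13.3
```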