
1_9e-3_5_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8603
  • Accuracy: 0.7489
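
For a quick sanity check, the checkpoint can be loaded with the standard transformers API. The snippet below is a minimal sketch, assuming the repository id Onutoa/1_9e-3_5_0.5 and a sequence-classification head; the specific SuperGLUE task and label mapping are not stated in this card, so the example inputs and the interpretation of the output classes are illustrative only.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Assumption: the checkpoint id matches this repository and exposes a
# sequence-classification head; the exact SuperGLUE task is not stated here.
model_id = "Onutoa/1_9e-3_5_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Illustrative sentence-pair input; replace with inputs for the actual task.
inputs = tokenizer(
    "Is the sky blue?",
    "The sky appears blue on a clear day.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities (labels are task-specific)
```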

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch reproducing them appears after this list):

  • learning_rate: 0.009
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
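
These settings map directly onto transformers.TrainingArguments. The sketch below is a hedged reconstruction, not the original training script: only the values listed above come from this card, while the output path and per-epoch evaluation strategy are illustrative assumptions (the latter is consistent with the epoch-level results table below). The Adam betas and epsilon shown are the Transformers defaults, matching the listed values.

```python
from transformers import TrainingArguments

# Hedged reconstruction of the listed hyperparameters; output_dir and
# evaluation_strategy are assumptions, not taken from the card.
training_args = TrainingArguments(
    output_dir="1_9e-3_5_0.5",           # assumed output path
    learning_rate=9e-3,
    per_device_train_batch_size=16,      # card's train_batch_size
    per_device_eval_batch_size=8,        # card's eval_batch_size
    seed=11,
    adam_beta1=0.9,                      # default Adam betas/epsilon,
    adam_beta2=0.999,                    # matching the values listed above
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",         # assumption: evaluate once per epoch
)
```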

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 2.7616        | 1.0   | 590   | 2.7583          | 0.3798   |
| 2.2507        | 2.0   | 1180  | 1.8432          | 0.6294   |
| 2.5953        | 3.0   | 1770  | 3.4928          | 0.4532   |
| 2.3305        | 4.0   | 2360  | 1.5737          | 0.6486   |
| 1.9577        | 5.0   | 2950  | 2.6604          | 0.6263   |
| 1.7557        | 6.0   | 3540  | 1.2734          | 0.6761   |
| 1.6227        | 7.0   | 4130  | 3.4140          | 0.5119   |
| 1.4961        | 8.0   | 4720  | 1.2029          | 0.7043   |
| 1.3331        | 9.0   | 5310  | 1.2170          | 0.7092   |
| 1.3007        | 10.0  | 5900  | 1.7625          | 0.6725   |
| 1.2049        | 11.0  | 6490  | 1.0667          | 0.7070   |
| 1.1087        | 12.0  | 7080  | 0.9915          | 0.7156   |
| 1.1023        | 13.0  | 7670  | 1.0683          | 0.6924   |
| 1.0404        | 14.0  | 8260  | 1.1711          | 0.7248   |
| 1.0287        | 15.0  | 8850  | 1.0966          | 0.7297   |
| 0.9405        | 16.0  | 9440  | 0.9352          | 0.7107   |
| 0.8558        | 17.0  | 10030 | 0.9269          | 0.7205   |
| 0.8273        | 18.0  | 10620 | 0.9574          | 0.7235   |
| 0.7798        | 19.0  | 11210 | 0.9598          | 0.7385   |
| 0.7646        | 20.0  | 11800 | 0.9004          | 0.7287   |
| 0.7505        | 21.0  | 12390 | 0.9389          | 0.7174   |
| 0.7273        | 22.0  | 12980 | 0.9234          | 0.7358   |
| 0.6971        | 23.0  | 13570 | 0.9055          | 0.7315   |
| 0.6815        | 24.0  | 14160 | 0.8711          | 0.7352   |
| 0.6729        | 25.0  | 14750 | 1.0923          | 0.7437   |
| 0.6151        | 26.0  | 15340 | 0.8950          | 0.7254   |
| 0.6291        | 27.0  | 15930 | 1.1086          | 0.6945   |
| 0.6243        | 28.0  | 16520 | 0.9179          | 0.7410   |
| 0.609         | 29.0  | 17110 | 1.0778          | 0.7410   |
| 0.5733        | 30.0  | 17700 | 0.9548          | 0.7422   |
| 0.5742        | 31.0  | 18290 | 1.1436          | 0.7413   |
| 0.5675        | 32.0  | 18880 | 0.8956          | 0.7450   |
| 0.5578        | 33.0  | 19470 | 0.9040          | 0.7382   |
| 0.5339        | 34.0  | 20060 | 0.8730          | 0.7453   |
| 0.5284        | 35.0  | 20650 | 1.0258          | 0.7486   |
| 0.5116        | 36.0  | 21240 | 1.2775          | 0.7382   |
| 0.5215        | 37.0  | 21830 | 0.9275          | 0.7477   |
| 0.5038        | 38.0  | 22420 | 0.8780          | 0.7394   |
| 0.5073        | 39.0  | 23010 | 0.9095          | 0.7468   |
| 0.4897        | 40.0  | 23600 | 0.8864          | 0.7410   |
| 0.4927        | 41.0  | 24190 | 1.1312          | 0.7391   |
| 0.4941        | 42.0  | 24780 | 0.8809          | 0.7339   |
| 0.4629        | 43.0  | 25370 | 1.1564          | 0.7419   |
| 0.4754        | 44.0  | 25960 | 0.9223          | 0.7413   |
| 0.457         | 45.0  | 26550 | 0.8677          | 0.7422   |
| 0.4398        | 46.0  | 27140 | 1.0571          | 0.7471   |
| 0.4612        | 47.0  | 27730 | 0.8773          | 0.7401   |
| 0.4464        | 48.0  | 28320 | 0.9260          | 0.7477   |
| 0.4779        | 49.0  | 28910 | 0.8712          | 0.7425   |
| 0.443         | 50.0  | 29500 | 0.8886          | 0.7413   |
| 0.4445        | 51.0  | 30090 | 0.8968          | 0.7431   |
| 0.4274        | 52.0  | 30680 | 0.9516          | 0.7495   |
| 0.4239        | 53.0  | 31270 | 0.8773          | 0.7443   |
| 0.4143        | 54.0  | 31860 | 1.0295          | 0.7401   |
| 0.4359        | 55.0  | 32450 | 0.8879          | 0.7453   |
| 0.4197        | 56.0  | 33040 | 0.8712          | 0.7489   |
| 0.397         | 57.0  | 33630 | 1.0037          | 0.7544   |
| 0.402         | 58.0  | 34220 | 0.8789          | 0.7554   |
| 0.4015        | 59.0  | 34810 | 0.8532          | 0.7523   |
| 0.4008        | 60.0  | 35400 | 0.8840          | 0.7523   |
| 0.3943        | 61.0  | 35990 | 0.9475          | 0.7462   |
| 0.3968        | 62.0  | 36580 | 0.9413          | 0.7465   |
| 0.394         | 63.0  | 37170 | 0.8878          | 0.7480   |
| 0.3914        | 64.0  | 37760 | 0.8737          | 0.7511   |
| 0.3959        | 65.0  | 38350 | 0.8553          | 0.7486   |
| 0.3881        | 66.0  | 38940 | 0.8905          | 0.7495   |
| 0.379         | 67.0  | 39530 | 0.8956          | 0.7489   |
| 0.3821        | 68.0  | 40120 | 0.8711          | 0.7514   |
| 0.3764        | 69.0  | 40710 | 0.9552          | 0.7557   |
| 0.3841        | 70.0  | 41300 | 0.9638          | 0.7523   |
| 0.3758        | 71.0  | 41890 | 0.8728          | 0.7453   |
| 0.376         | 72.0  | 42480 | 0.9654          | 0.7450   |
| 0.364         | 73.0  | 43070 | 1.0121          | 0.7477   |
| 0.3567        | 74.0  | 43660 | 1.0070          | 0.7508   |
| 0.3723        | 75.0  | 44250 | 0.9271          | 0.7508   |
| 0.3673        | 76.0  | 44840 | 0.8824          | 0.7450   |
| 0.3656        | 77.0  | 45430 | 0.8812          | 0.7477   |
| 0.3722        | 78.0  | 46020 | 0.8728          | 0.7502   |
| 0.3719        | 79.0  | 46610 | 0.8551          | 0.7465   |
| 0.3502        | 80.0  | 47200 | 0.8913          | 0.7523   |
| 0.3467        | 81.0  | 47790 | 0.8476          | 0.7489   |
| 0.348         | 82.0  | 48380 | 0.8885          | 0.7517   |
| 0.3498        | 83.0  | 48970 | 0.8690          | 0.7443   |
| 0.3457        | 84.0  | 49560 | 0.8824          | 0.7480   |
| 0.3463        | 85.0  | 50150 | 0.8450          | 0.7453   |
| 0.3465        | 86.0  | 50740 | 0.8760          | 0.7459   |
| 0.3418        | 87.0  | 51330 | 0.8702          | 0.7437   |
| 0.3394        | 88.0  | 51920 | 0.8782          | 0.7434   |
| 0.3371        | 89.0  | 52510 | 0.8950          | 0.7474   |
| 0.3309        | 90.0  | 53100 | 0.8568          | 0.7398   |
| 0.3321        | 91.0  | 53690 | 0.8973          | 0.7495   |
| 0.3385        | 92.0  | 54280 | 0.8401          | 0.7431   |
| 0.3264        | 93.0  | 54870 | 0.8658          | 0.7462   |
| 0.3382        | 94.0  | 55460 | 0.8652          | 0.7483   |
| 0.3279        | 95.0  | 56050 | 0.8785          | 0.7465   |
| 0.3274        | 96.0  | 56640 | 0.8666          | 0.7477   |
| 0.3272        | 97.0  | 57230 | 0.8666          | 0.7489   |
| 0.3147        | 98.0  | 57820 | 0.8641          | 0.7498   |
| 0.3172        | 99.0  | 58410 | 0.8616          | 0.7486   |
| 0.3256        | 100.0 | 59000 | 0.8603          | 0.7489   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
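
When reproducing results, it can help to confirm that the installed packages match the pinned versions above. A small convenience snippet (not part of the original card) that prints the installed versions alongside the ones this card was produced with:

```python
import transformers, torch, datasets, tokenizers

# Versions this card was produced with (see the list above).
expected = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}
for module in (transformers, torch, datasets, tokenizers):
    name = module.__name__
    print(f"{name}: installed {module.__version__}, card used {expected[name]}")
```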