Onutoa/1_7e-3_1_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset; the card does not state which SuperGLUE task was used. It achieves the following results on the evaluation set (a minimal loading example follows the metrics):

  • Loss: 0.4732
  • Accuracy: 0.7462
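
To load the checkpoint, a minimal sketch is shown below. It assumes the model carries a sequence-classification head, consistent with the accuracy metric above; the question/passage inputs are hypothetical placeholders, since the card does not document the task or input format.

```python
# Minimal loading sketch for this checkpoint (repo id taken from this card).
# The example inputs are hypothetical; adapt them to the actual SuperGLUE subset.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Onutoa/1_7e-3_1_0.5")
model = AutoModelForSequenceClassification.from_pretrained("Onutoa/1_7e-3_1_0.5")

inputs = tokenizer(
    "is the sky blue during the day",     # hypothetical question
    "The sky appears blue in daylight.",  # hypothetical passage
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class id
```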

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
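
The card names super_glue but not the task. For reference, the per-epoch step count in the results below (590 steps at train batch size 16, i.e. capacity for about 9,440 examples) is consistent with the BoolQ subset's 9,427 training examples, so the sketch below assumes BoolQ; treat that task choice as an inference, not a documented fact.

```python
# Sketch of loading the likely training data. "boolq" is an inference from
# the step counts in this card, not a documented choice.
from datasets import load_dataset

dataset = load_dataset("super_glue", "boolq")
print(dataset["train"].num_rows)       # 9427
print(dataset["validation"].num_rows)  # 3270
```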

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 0.007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
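
The sketch below reconstructs these settings with the transformers Trainer API (4.30-era argument names). Only the values listed above are filled in; everything else stays at library defaults, and the output directory is hypothetical. Note that the Trainer's default optimizer is AdamW; the "Adam" line above reports its beta/epsilon settings.

```python
# Sketch of the hyperparameters above as transformers TrainingArguments.
# Unlisted settings remain at defaults; output_dir is hypothetical.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_7e-3_1_0.5",       # hypothetical path
    learning_rate=7e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                  # "Adam with betas=(0.9,0.999)"
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",     # matches the per-epoch rows below
)
```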

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy
------------- | ----- | ---- | --------------- | --------
0.9787 | 1.0 | 590 | 0.7825 | 0.6217
1.0111 | 2.0 | 1180 | 0.7676 | 0.6021
0.9238 | 3.0 | 1770 | 0.6005 | 0.6217
0.8313 | 4.0 | 2360 | 0.6038 | 0.4321
0.7671 | 5.0 | 2950 | 0.9066 | 0.6217
0.7472 | 6.0 | 3540 | 0.6074 | 0.4560
0.7577 | 7.0 | 4130 | 0.6978 | 0.3807
0.6835 | 8.0 | 4720 | 0.6612 | 0.6217
0.6855 | 9.0 | 5310 | 0.7161 | 0.6217
0.6572 | 10.0 | 5900 | 0.5321 | 0.6370
0.6389 | 11.0 | 6490 | 0.5122 | 0.6621
0.5993 | 12.0 | 7080 | 0.5795 | 0.6612
0.587 | 13.0 | 7670 | 0.5287 | 0.6245
0.5662 | 14.0 | 8260 | 0.4982 | 0.6664
0.5474 | 15.0 | 8850 | 0.5174 | 0.6453
0.5533 | 16.0 | 9440 | 0.5125 | 0.6890
0.5201 | 17.0 | 10030 | 0.4753 | 0.6716
0.5055 | 18.0 | 10620 | 0.4841 | 0.6755
0.4886 | 19.0 | 11210 | 0.4682 | 0.7028
0.4806 | 20.0 | 11800 | 0.4591 | 0.6905
0.456 | 21.0 | 12390 | 0.4729 | 0.6896
0.4627 | 22.0 | 12980 | 0.4434 | 0.7003
0.4301 | 23.0 | 13570 | 0.4426 | 0.7092
0.4203 | 24.0 | 14160 | 0.4324 | 0.7092
0.4175 | 25.0 | 14750 | 0.4642 | 0.7275
0.3993 | 26.0 | 15340 | 0.5582 | 0.6459
0.3972 | 27.0 | 15930 | 0.4367 | 0.7076
0.3812 | 28.0 | 16520 | 0.4484 | 0.7278
0.3726 | 29.0 | 17110 | 0.4581 | 0.7202
0.3781 | 30.0 | 17700 | 0.4322 | 0.7275
0.3578 | 31.0 | 18290 | 0.4970 | 0.7217
0.3458 | 32.0 | 18880 | 0.6182 | 0.7095
0.3434 | 33.0 | 19470 | 0.4644 | 0.7095
0.3338 | 34.0 | 20060 | 0.4355 | 0.7199
0.3344 | 35.0 | 20650 | 0.4495 | 0.7223
0.3308 | 36.0 | 21240 | 0.4515 | 0.7330
0.3208 | 37.0 | 21830 | 0.4562 | 0.7373
0.3012 | 38.0 | 22420 | 0.4464 | 0.7211
0.3055 | 39.0 | 23010 | 0.4410 | 0.7382
0.306 | 40.0 | 23600 | 0.5016 | 0.7343
0.2894 | 41.0 | 24190 | 0.4726 | 0.7364
0.2834 | 42.0 | 24780 | 0.4714 | 0.7379
0.2789 | 43.0 | 25370 | 0.4379 | 0.7199
0.2759 | 44.0 | 25960 | 0.4570 | 0.7287
0.2667 | 45.0 | 26550 | 0.4500 | 0.7294
0.2564 | 46.0 | 27140 | 0.4628 | 0.7413
0.2541 | 47.0 | 27730 | 0.4643 | 0.7379
0.2498 | 48.0 | 28320 | 0.4406 | 0.7336
0.2571 | 49.0 | 28910 | 0.4427 | 0.7373
0.2423 | 50.0 | 29500 | 0.4658 | 0.7315
0.2374 | 51.0 | 30090 | 0.4744 | 0.7214
0.2415 | 52.0 | 30680 | 0.5416 | 0.7373
0.2309 | 53.0 | 31270 | 0.4830 | 0.7226
0.2282 | 54.0 | 31860 | 0.4758 | 0.7343
0.2307 | 55.0 | 32450 | 0.4698 | 0.7266
0.2213 | 56.0 | 33040 | 0.4458 | 0.7446
0.2193 | 57.0 | 33630 | 0.4778 | 0.7382
0.214 | 58.0 | 34220 | 0.4828 | 0.7456
0.207 | 59.0 | 34810 | 0.4818 | 0.7294
0.21 | 60.0 | 35400 | 0.4614 | 0.7508
0.2118 | 61.0 | 35990 | 0.4507 | 0.7480
0.2031 | 62.0 | 36580 | 0.4718 | 0.7416
0.1987 | 63.0 | 37170 | 0.4752 | 0.7324
0.2018 | 64.0 | 37760 | 0.4431 | 0.7388
0.1889 | 65.0 | 38350 | 0.4769 | 0.7385
0.1941 | 66.0 | 38940 | 0.4623 | 0.7443
0.1898 | 67.0 | 39530 | 0.4818 | 0.7355
0.1872 | 68.0 | 40120 | 0.4678 | 0.7446
0.1813 | 69.0 | 40710 | 0.4843 | 0.7529
0.1893 | 70.0 | 41300 | 0.4702 | 0.7459
0.1885 | 71.0 | 41890 | 0.4931 | 0.7193
0.1811 | 72.0 | 42480 | 0.4854 | 0.7477
0.1755 | 73.0 | 43070 | 0.4848 | 0.7373
0.1768 | 74.0 | 43660 | 0.4867 | 0.7520
0.1728 | 75.0 | 44250 | 0.5011 | 0.7477
0.1791 | 76.0 | 44840 | 0.4876 | 0.7416
0.1733 | 77.0 | 45430 | 0.4920 | 0.7486
0.1745 | 78.0 | 46020 | 0.4711 | 0.7492
0.1741 | 79.0 | 46610 | 0.4661 | 0.7401
0.1706 | 80.0 | 47200 | 0.4670 | 0.7422
0.165 | 81.0 | 47790 | 0.4736 | 0.7459
0.1612 | 82.0 | 48380 | 0.4660 | 0.7459
0.1722 | 83.0 | 48970 | 0.4772 | 0.7410
0.1638 | 84.0 | 49560 | 0.4767 | 0.7434
0.1613 | 85.0 | 50150 | 0.4641 | 0.7391
0.1649 | 86.0 | 50740 | 0.4783 | 0.7450
0.1609 | 87.0 | 51330 | 0.4734 | 0.7453
0.1588 | 88.0 | 51920 | 0.4919 | 0.7508
0.1601 | 89.0 | 52510 | 0.4698 | 0.7453
0.1573 | 90.0 | 53100 | 0.4765 | 0.7508
0.1584 | 91.0 | 53690 | 0.4754 | 0.7492
0.1587 | 92.0 | 54280 | 0.4704 | 0.7413
0.1521 | 93.0 | 54870 | 0.4865 | 0.7505
0.1546 | 94.0 | 55460 | 0.4777 | 0.7505
0.1539 | 95.0 | 56050 | 0.4791 | 0.7526
0.1545 | 96.0 | 56640 | 0.4721 | 0.7456
0.1533 | 97.0 | 57230 | 0.4725 | 0.7407
0.1476 | 98.0 | 57820 | 0.4709 | 0.7462
0.1489 | 99.0 | 58410 | 0.4731 | 0.7459
0.1501 | 100.0 | 59000 | 0.4732 | 0.7462
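
Note that the final checkpoint is not the best row in the table: the lowest validation loss (0.4322) occurs at epoch 30 and the highest accuracy (0.7529) at epoch 69. A small sketch for selecting a best epoch from a Trainer-style evaluation history follows; the records are hand-copied excerpts of the table above (with a real run they would come from trainer.state.log_history or a saved trainer_state.json, if present).

```python
# Sketch: pick the best epochs out of a Trainer-style eval history.
# These three records are excerpts of the table above, for illustration only.
history = [
    {"epoch": 30.0, "eval_loss": 0.4322, "eval_accuracy": 0.7275},
    {"epoch": 69.0, "eval_loss": 0.4843, "eval_accuracy": 0.7529},
    {"epoch": 100.0, "eval_loss": 0.4732, "eval_accuracy": 0.7462},
]

best_loss = min(history, key=lambda r: r["eval_loss"])
best_acc = max(history, key=lambda r: r["eval_accuracy"])
print(f"lowest eval loss {best_loss['eval_loss']} at epoch {best_loss['epoch']}")
print(f"best accuracy {best_acc['eval_accuracy']} at epoch {best_acc['epoch']}")
```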

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3