1_9e-3_10_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0354
  • Accuracy: 0.7401
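
The card does not include a usage example. The sketch below is a minimal, hedged one: it assumes the checkpoint is published under the repository id Onutoa/1_9e-3_10_0.1 (referenced on this page) and loads as a standard sequence-classification model. The exact SuperGLUE task, and therefore the expected input fields, is not documented here, so the sentence-pair call is illustrative only.

```python
# Minimal, hedged usage sketch -- not part of the original card.
# Assumes the checkpoint lives at "Onutoa/1_9e-3_10_0.1" and exposes a
# standard sequence-classification head; the actual SuperGLUE task (and thus
# the correct input fields) is not stated in this card.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/1_9e-3_10_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Illustrative sentence-pair input; replace with the fields of the real task.
inputs = tokenizer("First text", "Second text", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

print(logits.argmax(dim=-1).item())  # predicted class index
```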

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
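
The card states only that the model was tuned on super_glue. A hedged sketch of loading that benchmark with the datasets library follows; the configuration name "boolq" is purely an illustrative assumption, since the card does not say which SuperGLUE task was used.

```python
# Hedged sketch: load the super_glue benchmark referenced by this card.
# The "boolq" configuration is an illustrative assumption; the card does not
# name the actual SuperGLUE task.
from datasets import load_dataset

raw_datasets = load_dataset("super_glue", "boolq")
print(raw_datasets)               # DatasetDict with train/validation/test splits
print(raw_datasets["train"][0])   # inspect one example to see the task's fields
```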

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch of this configuration follows the list):

  • learning_rate: 0.009
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
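
For reference, these values correspond roughly to the TrainingArguments sketch below. It is a reconstruction from the list above, not the original training script; data preprocessing, the model head, and metric computation are omitted.

```python
# Hedged reconstruction of the listed hyperparameters as TrainingArguments.
# Not the original training script: dataset preparation, the model, and
# compute_metrics are omitted here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_9e-3_10_0.1",       # assumed; matches the model name
    learning_rate=9e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",      # the results table shows one evaluation per epoch
)
```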

Training results

Training Loss Epoch Step Validation Loss Accuracy
1.6659 1.0 590 1.0802 0.3801
1.744 2.0 1180 1.0364 0.5086
1.4165 3.0 1770 1.0525 0.4324
1.3808 4.0 2360 1.2296 0.6217
1.307 5.0 2950 3.0278 0.3835
1.228 6.0 3540 1.1153 0.6489
1.2785 7.0 4130 2.8946 0.4211
1.1321 8.0 4720 0.9307 0.6416
1.0781 9.0 5310 0.8861 0.6914
1.0489 10.0 5900 2.3977 0.6220
0.9691 11.0 6490 0.8622 0.6609
1.012 12.0 7080 0.7911 0.7031
0.9394 13.0 7670 0.7907 0.7086
0.9733 14.0 8260 1.6734 0.4859
0.8923 15.0 8850 1.1847 0.5654
0.8492 16.0 9440 0.9835 0.7116
0.8235 17.0 10030 1.1283 0.6428
0.7418 18.0 10620 0.9441 0.6832
0.8598 19.0 11210 0.7886 0.7190
0.7646 20.0 11800 0.7994 0.7211
0.6827 21.0 12390 0.8823 0.7122
0.6563 22.0 12980 1.1212 0.6364
0.6387 23.0 13570 0.8303 0.7113
0.6676 24.0 14160 1.3662 0.6251
0.598 25.0 14750 1.0796 0.6474
0.5547 26.0 15340 0.9681 0.6835
0.5539 27.0 15930 0.8656 0.7055
0.542 28.0 16520 1.0407 0.6688
0.519 29.0 17110 1.0368 0.7223
0.5087 30.0 17700 1.4459 0.7110
0.5462 31.0 18290 0.8618 0.7324
0.4592 32.0 18880 1.0897 0.7168
0.4374 33.0 19470 0.9626 0.7107
0.4665 34.0 20060 0.9022 0.7379
0.4086 35.0 20650 0.8794 0.7339
0.4042 36.0 21240 1.2955 0.7153
0.4267 37.0 21830 1.0492 0.7275
0.3928 38.0 22420 0.8772 0.7306
0.3777 39.0 23010 0.9378 0.7193
0.3693 40.0 23600 1.3226 0.6832
0.3782 41.0 24190 1.3153 0.7284
0.3429 42.0 24780 0.9722 0.7171
0.3359 43.0 25370 1.0545 0.7321
0.3431 44.0 25960 0.9919 0.7321
0.326 45.0 26550 0.8933 0.7202
0.3004 46.0 27140 1.0468 0.7361
0.3233 47.0 27730 1.0189 0.7318
0.3045 48.0 28320 1.3587 0.6740
0.3399 49.0 28910 1.0820 0.7092
0.2913 50.0 29500 1.2963 0.6835
0.2956 51.0 30090 0.9578 0.7324
0.2839 52.0 30680 1.0030 0.7437
0.2701 53.0 31270 1.1058 0.7245
0.2561 54.0 31860 1.0679 0.7156
0.2644 55.0 32450 1.0564 0.7388
0.2711 56.0 33040 1.1395 0.7193
0.2311 57.0 33630 1.0809 0.7434
0.2533 58.0 34220 1.0640 0.7450
0.2536 59.0 34810 1.0119 0.7468
0.2427 60.0 35400 1.0311 0.7266
0.2354 61.0 35990 1.0316 0.7346
0.223 62.0 36580 1.0253 0.7450
0.2257 63.0 37170 1.0761 0.7391
0.223 64.0 37760 1.0619 0.7388
0.2319 65.0 38350 0.9937 0.7443
0.2287 66.0 38940 1.1042 0.7413
0.2105 67.0 39530 1.0410 0.7404
0.2109 68.0 40120 0.9820 0.7343
0.2012 69.0 40710 1.0243 0.7456
0.2035 70.0 41300 1.0944 0.7434
0.2039 71.0 41890 1.0195 0.7346
0.201 72.0 42480 1.1017 0.7431
0.1952 73.0 43070 1.1423 0.7254
0.1837 74.0 43660 1.0600 0.7391
0.1891 75.0 44250 1.0447 0.7437
0.1885 76.0 44840 1.0443 0.7471
0.1928 77.0 45430 1.0006 0.7437
0.1952 78.0 46020 1.0411 0.7453
0.1787 79.0 46610 1.0275 0.7413
0.1701 80.0 47200 1.0867 0.7272
0.1654 81.0 47790 1.0261 0.7330
0.1808 82.0 48380 1.0537 0.7339
0.1794 83.0 48970 1.0808 0.7456
0.1671 84.0 49560 1.0418 0.7404
0.1668 85.0 50150 1.0140 0.7407
0.1726 86.0 50740 1.0860 0.7456
0.1643 87.0 51330 1.0581 0.7352
0.1596 88.0 51920 1.0603 0.7349
0.1612 89.0 52510 1.0412 0.7422
0.1563 90.0 53100 1.0482 0.7401
0.1567 91.0 53690 1.1036 0.7431
0.1601 92.0 54280 1.0126 0.7388
0.1566 93.0 54870 1.0497 0.7352
0.1558 94.0 55460 1.0246 0.7388
0.1518 95.0 56050 1.0406 0.7413
0.1503 96.0 56640 1.0261 0.7425
0.1523 97.0 57230 1.0411 0.7370
0.1426 98.0 57820 1.0398 0.7416
0.1465 99.0 58410 1.0459 0.7388
0.1388 100.0 59000 1.0354 0.7401

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3