1_5e-3_5_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9516
  • Accuracy: 0.7450

Model description

More information needed

Intended uses & limitations

More information needed
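
Pending more detail from the author, the snippet below is a minimal inference sketch rather than an official usage example. It assumes the checkpoint exposes a standard sequence-classification head under the Hub id Onutoa/1_5e-3_5_0.5; the card does not say which super_glue task was used, so the (question, passage) input pair is purely illustrative.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/1_5e-3_5_0.5"  # Hub repository id (assumed from this card)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Illustrative BoolQ-style input; the actual task and input format are not documented.
inputs = tokenizer(
    "is the sky blue on a clear day",
    "On a clear day the sky appears blue to an observer on the ground.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # per-class probabilities
```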

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list for how they map onto TrainingArguments):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
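
The original training script is not part of this card; the following is only a sketch of how the listed values map onto transformers TrainingArguments (for the Transformers 4.30.0 release listed below). The output directory and the per-epoch evaluation strategy are assumptions, the latter inferred from the once-per-epoch rows in the results table.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_5e-3_5_0.5",    # assumed; matches the model name
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",   # linear decay of the learning rate
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08, as listed above
    # (these are also the Trainer defaults):
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",  # assumed from the per-epoch results below
)
```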

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 2.4372 | 1.0 | 590 | 1.8593 | 0.6177 |
| 2.3953 | 2.0 | 1180 | 3.6910 | 0.3786 |
| 2.3694 | 3.0 | 1770 | 2.1033 | 0.4694 |
| 2.0494 | 4.0 | 2360 | 1.7694 | 0.6006 |
| 2.034 | 5.0 | 2950 | 1.7949 | 0.6355 |
| 1.8146 | 6.0 | 3540 | 1.7374 | 0.6159 |
| 1.896 | 7.0 | 4130 | 1.8850 | 0.5624 |
| 1.7794 | 8.0 | 4720 | 2.8405 | 0.6245 |
| 1.8298 | 9.0 | 5310 | 2.6985 | 0.4349 |
| 1.7892 | 10.0 | 5900 | 2.2049 | 0.6352 |
| 1.6916 | 11.0 | 6490 | 1.6606 | 0.6272 |
| 1.6384 | 12.0 | 7080 | 1.5955 | 0.6394 |
| 1.6382 | 13.0 | 7670 | 1.6722 | 0.6596 |
| 1.6078 | 14.0 | 8260 | 1.4874 | 0.6587 |
| 1.5373 | 15.0 | 8850 | 1.4382 | 0.6642 |
| 1.4655 | 16.0 | 9440 | 1.4120 | 0.6700 |
| 1.4354 | 17.0 | 10030 | 2.0067 | 0.6532 |
| 1.4021 | 18.0 | 10620 | 1.7860 | 0.5875 |
| 1.3537 | 19.0 | 11210 | 1.4043 | 0.6853 |
| 1.3638 | 20.0 | 11800 | 1.3726 | 0.6875 |
| 1.3061 | 21.0 | 12390 | 1.3332 | 0.6740 |
| 1.3052 | 22.0 | 12980 | 1.2831 | 0.6939 |
| 1.4056 | 23.0 | 13570 | 1.4235 | 0.6835 |
| 1.3389 | 24.0 | 14160 | 1.5395 | 0.6817 |
| 1.2294 | 25.0 | 14750 | 1.2364 | 0.6994 |
| 1.2213 | 26.0 | 15340 | 1.1806 | 0.7012 |
| 1.203 | 27.0 | 15930 | 1.3771 | 0.6538 |
| 1.1667 | 28.0 | 16520 | 1.3193 | 0.6820 |
| 1.1516 | 29.0 | 17110 | 1.3490 | 0.6621 |
| 1.1657 | 30.0 | 17700 | 1.1866 | 0.7015 |
| 1.1212 | 31.0 | 18290 | 1.2403 | 0.6991 |
| 1.0632 | 32.0 | 18880 | 1.1608 | 0.7138 |
| 1.0702 | 33.0 | 19470 | 1.3606 | 0.6642 |
| 1.0609 | 34.0 | 20060 | 1.1448 | 0.6972 |
| 1.0407 | 35.0 | 20650 | 1.2761 | 0.6838 |
| 1.0151 | 36.0 | 21240 | 2.0245 | 0.6862 |
| 1.0246 | 37.0 | 21830 | 1.0999 | 0.7012 |
| 0.9971 | 38.0 | 22420 | 1.1661 | 0.6997 |
| 0.9732 | 39.0 | 23010 | 1.1978 | 0.7187 |
| 0.9642 | 40.0 | 23600 | 1.0760 | 0.7245 |
| 0.9628 | 41.0 | 24190 | 1.2119 | 0.7223 |
| 0.9605 | 42.0 | 24780 | 1.0589 | 0.7245 |
| 0.9297 | 43.0 | 25370 | 1.0496 | 0.7297 |
| 0.9282 | 44.0 | 25960 | 1.0384 | 0.7324 |
| 0.8927 | 45.0 | 26550 | 1.0954 | 0.7284 |
| 0.8753 | 46.0 | 27140 | 1.0344 | 0.7343 |
| 0.8787 | 47.0 | 27730 | 1.0238 | 0.7162 |
| 0.8397 | 48.0 | 28320 | 1.0650 | 0.7162 |
| 0.9109 | 49.0 | 28910 | 1.0901 | 0.7297 |
| 0.8609 | 50.0 | 29500 | 1.0152 | 0.7300 |
| 0.823 | 51.0 | 30090 | 1.1109 | 0.7128 |
| 0.8029 | 52.0 | 30680 | 1.0899 | 0.7113 |
| 0.8142 | 53.0 | 31270 | 1.0185 | 0.7339 |
| 0.7967 | 54.0 | 31860 | 0.9917 | 0.7336 |
| 0.7919 | 55.0 | 32450 | 1.0096 | 0.7352 |
| 0.7883 | 56.0 | 33040 | 1.0033 | 0.7355 |
| 0.7794 | 57.0 | 33630 | 1.0478 | 0.7336 |
| 0.7444 | 58.0 | 34220 | 1.0485 | 0.7284 |
| 0.7646 | 59.0 | 34810 | 1.0046 | 0.7242 |
| 0.7493 | 60.0 | 35400 | 0.9997 | 0.7300 |
| 0.7126 | 61.0 | 35990 | 0.9838 | 0.7398 |
| 0.7303 | 62.0 | 36580 | 0.9983 | 0.7300 |
| 0.7184 | 63.0 | 37170 | 1.1151 | 0.7156 |
| 0.711 | 64.0 | 37760 | 1.0758 | 0.7220 |
| 0.6963 | 65.0 | 38350 | 0.9884 | 0.7281 |
| 0.6972 | 66.0 | 38940 | 0.9688 | 0.7336 |
| 0.6927 | 67.0 | 39530 | 0.9794 | 0.7339 |
| 0.6923 | 68.0 | 40120 | 0.9681 | 0.7379 |
| 0.6829 | 69.0 | 40710 | 1.0167 | 0.7440 |
| 0.6705 | 70.0 | 41300 | 0.9709 | 0.7358 |
| 0.6717 | 71.0 | 41890 | 1.0276 | 0.7226 |
| 0.6683 | 72.0 | 42480 | 0.9858 | 0.7324 |
| 0.6405 | 73.0 | 43070 | 0.9954 | 0.7336 |
| 0.6423 | 74.0 | 43660 | 0.9730 | 0.7339 |
| 0.6628 | 75.0 | 44250 | 1.0100 | 0.7388 |
| 0.6528 | 76.0 | 44840 | 0.9663 | 0.7398 |
| 0.6327 | 77.0 | 45430 | 0.9619 | 0.7358 |
| 0.6434 | 78.0 | 46020 | 0.9671 | 0.7361 |
| 0.6261 | 79.0 | 46610 | 0.9778 | 0.7248 |
| 0.6312 | 80.0 | 47200 | 0.9802 | 0.7343 |
| 0.6098 | 81.0 | 47790 | 0.9736 | 0.7431 |
| 0.6221 | 82.0 | 48380 | 0.9820 | 0.7330 |
| 0.6166 | 83.0 | 48970 | 0.9587 | 0.7431 |
| 0.6072 | 84.0 | 49560 | 0.9671 | 0.7370 |
| 0.5986 | 85.0 | 50150 | 0.9629 | 0.7385 |
| 0.5959 | 86.0 | 50740 | 0.9576 | 0.7407 |
| 0.5858 | 87.0 | 51330 | 0.9793 | 0.7428 |
| 0.5846 | 88.0 | 51920 | 0.9722 | 0.7404 |
| 0.5879 | 89.0 | 52510 | 0.9822 | 0.7394 |
| 0.582 | 90.0 | 53100 | 0.9625 | 0.7422 |
| 0.5805 | 91.0 | 53690 | 0.9856 | 0.7443 |
| 0.5767 | 92.0 | 54280 | 0.9560 | 0.7404 |
| 0.5711 | 93.0 | 54870 | 0.9629 | 0.7440 |
| 0.5769 | 94.0 | 55460 | 0.9560 | 0.7431 |
| 0.557 | 95.0 | 56050 | 0.9562 | 0.7434 |
| 0.5706 | 96.0 | 56640 | 0.9565 | 0.7440 |
| 0.5691 | 97.0 | 57230 | 0.9515 | 0.7425 |
| 0.5496 | 98.0 | 57820 | 0.9570 | 0.7410 |
| 0.5643 | 99.0 | 58410 | 0.9512 | 0.7434 |
| 0.5539 | 100.0 | 59000 | 0.9516 | 0.7450 |
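
The run evaluates every 590 optimizer steps, i.e. once per epoch, for 59,000 steps in total. With the linear scheduler and no warmup (the Trainer default when none is listed, and an assumption here), the learning rate decays in a straight line from 5e-3 to 0 over those steps, as the sketch below illustrates.

```python
# Straight-line decay over the full run: 100 epochs x 590 steps = 59,000 steps.
def linear_lr(step: int, total_steps: int = 59_000, base_lr: float = 5e-3) -> float:
    """Learning rate under a linear schedule with zero warmup (assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))       # 0.005 at the first step
print(linear_lr(29_500))  # 0.0025 halfway through (end of epoch 50)
print(linear_lr(59_000))  # 0.0 at the final step
```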

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3