
1_5e-3_10_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9486
  • Accuracy: 0.7520
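
The checkpoint id below is taken from this card's repository name; the sequence-classification head and the question/passage input format are assumptions, since the card does not state which SuperGLUE task the model was fine-tuned on. Treat this as a minimal loading sketch, not a confirmed recipe:

```python
# Minimal inference sketch, assuming a sequence-classification head.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/1_5e-3_10_0.9"  # repository id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Hypothetical question/passage pair; the real input format depends on
# the (unstated) SuperGLUE task used for fine-tuning.
inputs = tokenizer(
    "is the sky blue during the day",
    "The sky appears blue in daylight because of Rayleigh scattering.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs)
```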

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
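
As a reproduction aid, here is a rough sketch of how the values above map onto `transformers.TrainingArguments` (matching the Transformers 4.30.0 API listed under "Framework versions"). The output directory is hypothetical and the evaluation strategy is inferred from the per-epoch results table; every other value is copied from the list above:

```python
# Hyperparameter sketch; only the mirrored values come from this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1_5e-3_10_0.9",      # hypothetical output path
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=100.0,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",     # the table below reports one eval per epoch
)
```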

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 4.2553        | 1.0   | 590   | 3.4885          | 0.6217   |
| 3.8771        | 2.0   | 1180  | 5.2589          | 0.4156   |
| 3.8841        | 3.0   | 1770  | 3.1457          | 0.6217   |
| 3.4978        | 4.0   | 2360  | 3.6630          | 0.5073   |
| 3.4514        | 5.0   | 2950  | 2.8535          | 0.6538   |
| 2.8512        | 6.0   | 3540  | 4.5431          | 0.6401   |
| 2.8629        | 7.0   | 4130  | 2.9999          | 0.5774   |
| 2.7803        | 8.0   | 4720  | 4.0455          | 0.6440   |
| 2.3648        | 9.0   | 5310  | 3.4814          | 0.6618   |
| 2.3135        | 10.0  | 5900  | 1.8693          | 0.6985   |
| 2.2615        | 11.0  | 6490  | 1.7206          | 0.7095   |
| 1.938         | 12.0  | 7080  | 2.2772          | 0.6664   |
| 1.9168        | 13.0  | 7670  | 1.5057          | 0.7012   |
| 1.7411        | 14.0  | 8260  | 1.4510          | 0.7239   |
| 1.7184        | 15.0  | 8850  | 1.3241          | 0.7211   |
| 1.5774        | 16.0  | 9440  | 1.8563          | 0.7153   |
| 1.5229        | 17.0  | 10030 | 1.3243          | 0.7226   |
| 1.4652        | 18.0  | 10620 | 1.3866          | 0.7333   |
| 1.4321        | 19.0  | 11210 | 1.2208          | 0.7294   |
| 1.4205        | 20.0  | 11800 | 1.4391          | 0.7080   |
| 1.3537        | 21.0  | 12390 | 1.2900          | 0.7382   |
| 1.3302        | 22.0  | 12980 | 1.2322          | 0.7398   |
| 1.2616        | 23.0  | 13570 | 1.2189          | 0.7391   |
| 1.2586        | 24.0  | 14160 | 1.1687          | 0.7410   |
| 1.2259        | 25.0  | 14750 | 1.1797          | 0.7336   |
| 1.1804        | 26.0  | 15340 | 1.0929          | 0.7394   |
| 1.1907        | 27.0  | 15930 | 1.2820          | 0.7168   |
| 1.2066        | 28.0  | 16520 | 1.2464          | 0.7422   |
| 1.1128        | 29.0  | 17110 | 1.1798          | 0.7180   |
| 1.0889        | 30.0  | 17700 | 1.1373          | 0.7474   |
| 1.0637        | 31.0  | 18290 | 1.0453          | 0.7382   |
| 1.058         | 32.0  | 18880 | 1.1689          | 0.7446   |
| 1.0553        | 33.0  | 19470 | 1.0705          | 0.7321   |
| 1.0404        | 34.0  | 20060 | 1.0731          | 0.7425   |
| 1.014         | 35.0  | 20650 | 1.0481          | 0.7459   |
| 1.0166        | 36.0  | 21240 | 1.0434          | 0.7508   |
| 0.9983        | 37.0  | 21830 | 1.1358          | 0.7471   |
| 1.0144        | 38.0  | 22420 | 1.0030          | 0.7425   |
| 1.0236        | 39.0  | 23010 | 1.2874          | 0.7437   |
| 0.9749        | 40.0  | 23600 | 1.3199          | 0.7370   |
| 0.9592        | 41.0  | 24190 | 1.0072          | 0.7352   |
| 0.9467        | 42.0  | 24780 | 1.0282          | 0.7422   |
| 0.921         | 43.0  | 25370 | 1.3284          | 0.7446   |
| 0.9328        | 44.0  | 25960 | 0.9873          | 0.7364   |
| 0.9192        | 45.0  | 26550 | 1.3185          | 0.7425   |
| 0.8882        | 46.0  | 27140 | 0.9961          | 0.7453   |
| 0.8986        | 47.0  | 27730 | 0.9880          | 0.7373   |
| 0.8635        | 48.0  | 28320 | 1.0019          | 0.7480   |
| 0.8988        | 49.0  | 28910 | 1.1254          | 0.7498   |
| 0.865         | 50.0  | 29500 | 0.9619          | 0.7468   |
| 0.8575        | 51.0  | 30090 | 1.0854          | 0.7502   |
| 0.8654        | 52.0  | 30680 | 0.9466          | 0.7462   |
| 0.8482        | 53.0  | 31270 | 1.0722          | 0.7483   |
| 0.8547        | 54.0  | 31860 | 1.1340          | 0.7492   |
| 0.8424        | 55.0  | 32450 | 1.0683          | 0.7462   |
| 0.8078        | 56.0  | 33040 | 1.0285          | 0.7495   |
| 0.8163        | 57.0  | 33630 | 0.9779          | 0.7502   |
| 0.8175        | 58.0  | 34220 | 0.9461          | 0.7505   |
| 0.816         | 59.0  | 34810 | 0.9991          | 0.7443   |
| 0.8123        | 60.0  | 35400 | 0.9554          | 0.7443   |
| 0.7827        | 61.0  | 35990 | 0.9765          | 0.7492   |
| 0.8139        | 62.0  | 36580 | 1.1876          | 0.7547   |
| 0.7938        | 63.0  | 37170 | 0.9484          | 0.7541   |
| 0.7712        | 64.0  | 37760 | 0.9400          | 0.7508   |
| 0.7834        | 65.0  | 38350 | 0.9793          | 0.7532   |
| 0.781         | 66.0  | 38940 | 0.9480          | 0.7498   |
| 0.7639        | 67.0  | 39530 | 1.1188          | 0.7593   |
| 0.7838        | 68.0  | 40120 | 1.0215          | 0.7541   |
| 0.7527        | 69.0  | 40710 | 1.0855          | 0.7529   |
| 0.7626        | 70.0  | 41300 | 1.0755          | 0.7526   |
| 0.7683        | 71.0  | 41890 | 0.9553          | 0.7566   |
| 0.7588        | 72.0  | 42480 | 0.9822          | 0.7581   |
| 0.7377        | 73.0  | 43070 | 1.0359          | 0.7557   |
| 0.731         | 74.0  | 43660 | 0.9513          | 0.7505   |
| 0.7536        | 75.0  | 44250 | 1.1317          | 0.7505   |
| 0.7449        | 76.0  | 44840 | 0.9001          | 0.7532   |
| 0.7428        | 77.0  | 45430 | 1.0150          | 0.7538   |
| 0.7271        | 78.0  | 46020 | 0.9623          | 0.7563   |
| 0.7383        | 79.0  | 46610 | 0.9535          | 0.7584   |
| 0.7186        | 80.0  | 47200 | 0.9970          | 0.7581   |
| 0.6823        | 81.0  | 47790 | 1.0485          | 0.7563   |
| 0.7259        | 82.0  | 48380 | 0.9706          | 0.7526   |
| 0.7039        | 83.0  | 48970 | 0.9543          | 0.7480   |
| 0.7259        | 84.0  | 49560 | 0.9387          | 0.7508   |
| 0.7092        | 85.0  | 50150 | 0.9828          | 0.7538   |
| 0.7259        | 86.0  | 50740 | 0.9145          | 0.7459   |
| 0.7195        | 87.0  | 51330 | 0.9313          | 0.7495   |
| 0.696         | 88.0  | 51920 | 0.9467          | 0.7492   |
| 0.6885        | 89.0  | 52510 | 0.9671          | 0.7526   |
| 0.6874        | 90.0  | 53100 | 0.9387          | 0.7511   |
| 0.6911        | 91.0  | 53690 | 1.0279          | 0.7492   |
| 0.6968        | 92.0  | 54280 | 0.9268          | 0.7511   |
| 0.6833        | 93.0  | 54870 | 0.9886          | 0.7517   |
| 0.7096        | 94.0  | 55460 | 0.9693          | 0.7532   |
| 0.6911        | 95.0  | 56050 | 0.9503          | 0.7547   |
| 0.6754        | 96.0  | 56640 | 0.9451          | 0.7544   |
| 0.6823        | 97.0  | 57230 | 0.9427          | 0.7535   |
| 0.6547        | 98.0  | 57820 | 0.9500          | 0.7526   |
| 0.6433        | 99.0  | 58410 | 0.9280          | 0.7505   |
| 0.6722        | 100.0 | 59000 | 0.9486          | 0.7520   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3