# 1_8e-3_10_0.5

This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the super_glue dataset. It achieves the following results on the evaluation set:
- Loss: 0.9754
- Accuracy: 0.7459
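
The card does not state which SuperGLUE task this checkpoint was fine-tuned on, so the snippet below is only a minimal usage sketch under that caveat: it assumes a standard sequence-classification head, and `MODEL_ID` is a hypothetical placeholder for this model's repository id.

```python
# Minimal inference sketch. MODEL_ID is a hypothetical placeholder, and the
# sentence pair is an arbitrary example, since the card does not name the
# SuperGLUE task this checkpoint was trained on.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "your-username/1_8e-3_10_0.5"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

inputs = tokenizer(
    "Is the sky blue?",                   # first text (e.g. a question)
    "The sky appears blue in daylight.",  # second text (e.g. a passage)
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```
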
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows this list):
- learning_rate: 0.008
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100.0
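
Below is a reproduction sketch of this configuration using the `Trainer` API, not the original training script. The SuperGLUE subset is an assumption: at 590 steps per epoch with a train batch size of 16 (≈9,440 examples), the numbers are consistent with `boolq` (9,427 training examples), but the card does not name the task.

```python
# Reproduction sketch, not the original training script. The "boolq" subset
# and the tokenization settings (max_length, padding) are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

raw = load_dataset("super_glue", "boolq")  # assumed subset (see note above)
tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased")

def tokenize(batch):
    return tokenizer(
        batch["question"], batch["passage"],
        truncation=True, padding="max_length", max_length=256,
    )

encoded = raw.map(tokenize, batched=True)

# Hyperparameters copied from the list above; Adam betas/epsilon are the
# Trainer defaults, matching the values stated in this card.
args = TrainingArguments(
    output_dir="out",
    learning_rate=8e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",
)

model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased")
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)
# trainer.train()  # uncomment to launch training
```
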
### Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
3.0295 | 1.0 | 590 | 5.2308 | 0.6217 |
3.1648 | 2.0 | 1180 | 2.6673 | 0.3908 |
2.5921 | 3.0 | 1770 | 5.0497 | 0.3761 |
2.9042 | 4.0 | 2360 | 2.2586 | 0.6291 |
2.4411 | 5.0 | 2950 | 6.5105 | 0.6217 |
2.3131 | 6.0 | 3540 | 2.7244 | 0.5183 |
2.0563 | 7.0 | 4130 | 4.6938 | 0.3783 |
1.9468 | 8.0 | 4720 | 1.5045 | 0.6862 |
1.9269 | 9.0 | 5310 | 1.7666 | 0.6734 |
1.9701 | 10.0 | 5900 | 1.8173 | 0.6780 |
1.8231 | 11.0 | 6490 | 1.6929 | 0.6752 |
1.7563 | 12.0 | 7080 | 1.3455 | 0.6862 |
1.726 | 13.0 | 7670 | 1.2870 | 0.6786 |
1.6706 | 14.0 | 8260 | 1.3862 | 0.6951 |
1.5876 | 15.0 | 8850 | 1.4384 | 0.6587 |
1.5067 | 16.0 | 9440 | 1.5336 | 0.6985 |
1.5777 | 17.0 | 10030 | 1.9860 | 0.5972 |
1.4323 | 18.0 | 10620 | 1.2068 | 0.7076 |
1.4228 | 19.0 | 11210 | 1.8071 | 0.6780 |
1.4335 | 20.0 | 11800 | 4.1127 | 0.6346 |
1.4549 | 21.0 | 12390 | 1.2302 | 0.7131 |
1.277 | 22.0 | 12980 | 1.2829 | 0.6771 |
1.2962 | 23.0 | 13570 | 1.2152 | 0.7070 |
1.4076 | 24.0 | 14160 | 1.5758 | 0.6529 |
1.3427 | 25.0 | 14750 | 1.1333 | 0.6997 |
1.1936 | 26.0 | 15340 | 1.1974 | 0.6917 |
1.1937 | 27.0 | 15930 | 1.2653 | 0.6948 |
1.2784 | 28.0 | 16520 | 1.0620 | 0.7242 |
1.1605 | 29.0 | 17110 | 2.7859 | 0.6734 |
1.1438 | 30.0 | 17700 | 1.8633 | 0.6428 |
1.1406 | 31.0 | 18290 | 1.6275 | 0.7098 |
1.0993 | 32.0 | 18880 | 1.2765 | 0.6969 |
1.158 | 33.0 | 19470 | 1.1218 | 0.7058 |
1.0432 | 34.0 | 20060 | 1.0562 | 0.7245 |
1.0295 | 35.0 | 20650 | 1.3146 | 0.7251 |
1.0041 | 36.0 | 21240 | 1.0308 | 0.7150 |
1.0104 | 37.0 | 21830 | 1.0149 | 0.7242 |
1.0096 | 38.0 | 22420 | 1.1232 | 0.7083 |
0.9661 | 39.0 | 23010 | 1.0316 | 0.7251 |
0.9183 | 40.0 | 23600 | 1.2166 | 0.7055 |
0.9298 | 41.0 | 24190 | 1.9118 | 0.7040 |
0.8799 | 42.0 | 24780 | 1.0190 | 0.7306 |
0.954 | 43.0 | 25370 | 1.0761 | 0.7263 |
0.853 | 44.0 | 25960 | 1.2006 | 0.7080 |
1.0647 | 45.0 | 26550 | 1.1605 | 0.7379 |
0.8562 | 46.0 | 27140 | 1.2208 | 0.7122 |
0.8421 | 47.0 | 27730 | 0.9974 | 0.7388 |
0.7865 | 48.0 | 28320 | 1.1207 | 0.7376 |
0.8998 | 49.0 | 28910 | 1.1221 | 0.7080 |
0.8044 | 50.0 | 29500 | 1.0191 | 0.7205 |
0.7771 | 51.0 | 30090 | 0.9921 | 0.7364 |
0.7886 | 52.0 | 30680 | 1.1379 | 0.7419 |
0.7756 | 53.0 | 31270 | 1.3039 | 0.7315 |
0.7232 | 54.0 | 31860 | 1.1143 | 0.7385 |
0.69 | 55.0 | 32450 | 1.1024 | 0.7239 |
0.7313 | 56.0 | 33040 | 1.3560 | 0.7370 |
0.7266 | 57.0 | 33630 | 0.9763 | 0.7431 |
0.7084 | 58.0 | 34220 | 1.4480 | 0.7291 |
0.7072 | 59.0 | 34810 | 1.4463 | 0.7336 |
0.6889 | 60.0 | 35400 | 1.2983 | 0.7330 |
0.6745 | 61.0 | 35990 | 0.9898 | 0.7413 |
0.6739 | 62.0 | 36580 | 0.9817 | 0.7373 |
0.6513 | 63.0 | 37170 | 0.9999 | 0.7391 |
0.6665 | 64.0 | 37760 | 0.9840 | 0.7367 |
0.6428 | 65.0 | 38350 | 1.0120 | 0.7284 |
0.6418 | 66.0 | 38940 | 1.0021 | 0.7401 |
0.6185 | 67.0 | 39530 | 1.0063 | 0.7327 |
0.6259 | 68.0 | 40120 | 1.0108 | 0.7339 |
0.6165 | 69.0 | 40710 | 1.0279 | 0.7440 |
0.6393 | 70.0 | 41300 | 1.1899 | 0.7183 |
0.5869 | 71.0 | 41890 | 0.9767 | 0.7333 |
0.605 | 72.0 | 42480 | 1.4097 | 0.7367 |
0.5906 | 73.0 | 43070 | 1.0036 | 0.7358 |
0.5704 | 74.0 | 43660 | 1.3105 | 0.7443 |
0.5872 | 75.0 | 44250 | 1.0241 | 0.7242 |
0.5755 | 76.0 | 44840 | 1.1519 | 0.7410 |
0.5967 | 77.0 | 45430 | 1.1481 | 0.7431 |
0.57 | 78.0 | 46020 | 1.0164 | 0.7398 |
0.5599 | 79.0 | 46610 | 1.1657 | 0.7391 |
0.5458 | 80.0 | 47200 | 1.1020 | 0.7422 |
0.5299 | 81.0 | 47790 | 1.0836 | 0.7437 |
0.5285 | 82.0 | 48380 | 0.9682 | 0.7391 |
0.538 | 83.0 | 48970 | 1.1895 | 0.7193 |
0.5277 | 84.0 | 49560 | 0.9778 | 0.7459 |
0.525 | 85.0 | 50150 | 0.9893 | 0.7364 |
0.5268 | 86.0 | 50740 | 0.9745 | 0.7434 |
0.518 | 87.0 | 51330 | 0.9654 | 0.7450 |
0.5212 | 88.0 | 51920 | 0.9665 | 0.7382 |
0.5132 | 89.0 | 52510 | 1.0605 | 0.7474 |
0.5155 | 90.0 | 53100 | 0.9605 | 0.7440 |
0.4986 | 91.0 | 53690 | 1.0163 | 0.7480 |
0.5004 | 92.0 | 54280 | 1.0187 | 0.7312 |
0.4846 | 93.0 | 54870 | 0.9721 | 0.7440 |
0.4963 | 94.0 | 55460 | 1.0295 | 0.7468 |
0.4759 | 95.0 | 56050 | 1.0004 | 0.7468 |
0.4905 | 96.0 | 56640 | 1.0361 | 0.7474 |
0.4994 | 97.0 | 57230 | 0.9591 | 0.7446 |
0.4673 | 98.0 | 57820 | 0.9604 | 0.7431 |
0.4734 | 99.0 | 58410 | 0.9771 | 0.7462 |
0.4588 | 100.0 | 59000 | 0.9754 | 0.7459 |
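
The accuracy column above is presumably plain argmax accuracy over the validation set. A minimal sketch of an equivalent `compute_metrics` hook using the `evaluate` library follows; this is an assumed equivalent, not the card's original metric code.

```python
# Sketch of an accuracy metric hook for the Trainer; an assumed equivalent
# of the metric behind the table above, not the card's original code.
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```
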
### Framework versions
- Transformers 4.30.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3