2_5e-3_20_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. After the final training epoch (epoch 60, step 35400), it achieves the following results on the evaluation set:

Loss: 0.6966
Accuracy: 0.7407

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.005
train_batch_size: 16
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60.0
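
The linear schedule implied by these settings can be sketched in plain Python. This is only an illustration: it assumes zero warmup steps (the card lists none) and takes the 590 steps per epoch from the training results table. The helper name `linear_lr` is hypothetical and is not part of the Transformers API.

```python
# Values taken from the hyperparameter list and the training results table.
BASE_LR = 0.005
NUM_EPOCHS = 60
STEPS_PER_EPOCH = 590  # step count logged at the end of epoch 1
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 35400, the final logged step


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Linearly decay the learning rate from base_lr to 0 (assumes no warmup)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


if __name__ == "__main__":
    # Print the learning rate at a few epoch boundaries.
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the learning rate is halved by epoch 30 and reaches zero at step 35400. If the actual run used warmup, the early part of the curve would differ.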

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
1.2231	1.0	590	1.2340	0.3789
1.2295	2.0	1180	1.2339	0.3798
1.0628	3.0	1770	1.3715	0.3823
1.0865	4.0	2360	1.9743	0.3783
1.0975	5.0	2950	0.9219	0.5908
0.9667	6.0	3540	0.8883	0.6465
0.9542	7.0	4130	1.1371	0.5211
0.9021	8.0	4720	0.8855	0.6703
0.8629	9.0	5310	0.8316	0.6841
0.824	10.0	5900	0.9914	0.6596
0.8085	11.0	6490	0.8443	0.6908
0.7644	12.0	7080	0.8058	0.6706
0.765	13.0	7670	0.7726	0.7
0.7438	14.0	8260	0.8309	0.6887
0.7459	15.0	8850	0.7637	0.7018
0.717	16.0	9440	0.8887	0.6254
0.6932	17.0	10030	0.7578	0.6991
0.7052	18.0	10620	0.7760	0.7049
0.6814	19.0	11210	0.7195	0.7162
0.7066	20.0	11800	0.7185	0.7239
0.6685	21.0	12390	0.7384	0.7196
0.673	22.0	12980	0.7108	0.7239
0.6678	23.0	13570	0.7177	0.7260
0.6494	24.0	14160	0.6995	0.7248
0.6415	25.0	14750	0.7502	0.7336
0.6456	26.0	15340	0.7096	0.7205
0.6303	27.0	15930	0.7382	0.7061
0.6168	28.0	16520	0.7049	0.7379
0.6076	29.0	17110	0.7018	0.7232
0.6083	30.0	17700	0.7522	0.7190
0.5955	31.0	18290	0.6889	0.7306
0.5929	32.0	18880	0.7513	0.7281
0.5827	33.0	19470	0.6930	0.7446
0.5727	34.0	20060	0.6848	0.7355
0.5557	35.0	20650	0.7043	0.7260
0.572	36.0	21240	0.6876	0.7367
0.5564	37.0	21830	0.6957	0.7394
0.5454	38.0	22420	0.7031	0.7275
0.5471	39.0	23010	0.6980	0.7367
0.5323	40.0	23600	0.7033	0.7382
0.5439	41.0	24190	0.7215	0.7205
0.5332	42.0	24780	0.6841	0.7401
0.5275	43.0	25370	0.6904	0.7413
0.5263	44.0	25960	0.7266	0.7248
0.5238	45.0	26550	0.6961	0.7428
0.5165	46.0	27140	0.7033	0.7330
0.5126	47.0	27730	0.6928	0.7425
0.5148	48.0	28320	0.6859	0.7413
0.5141	49.0	28910	0.6945	0.7379
0.4973	50.0	29500	0.6952	0.7391
0.5043	51.0	30090	0.6954	0.7364
0.4966	52.0	30680	0.6890	0.7376
0.4967	53.0	31270	0.6937	0.7428
0.4974	54.0	31860	0.7009	0.7370
0.4977	55.0	32450	0.6961	0.7398
0.4948	56.0	33040	0.6986	0.7391
0.479	57.0	33630	0.6919	0.7407
0.4835	58.0	34220	0.6965	0.7440
0.4811	59.0	34810	0.6962	0.7419
0.485	60.0	35400	0.6966	0.7407
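
The headline numbers at the top of this card come from the last row of the table, which is not necessarily the best checkpoint. A minimal sketch for checking this is shown below; the rows are an excerpt copied by hand from the table above, and the variable names are illustrative only.

```python
# Excerpt of (epoch, validation_loss, accuracy) rows copied from the table above.
ROWS = [
    (1, 1.2340, 0.3789),
    (5, 0.9219, 0.5908),
    (20, 0.7185, 0.7239),
    (33, 0.6930, 0.7446),
    (34, 0.6848, 0.7355),
    (42, 0.6841, 0.7401),
    (58, 0.6965, 0.7440),
    (60, 0.6966, 0.7407),
]

# Row with the highest validation accuracy.
best_acc = max(ROWS, key=lambda row: row[2])
# Row with the lowest validation loss.
best_loss = min(ROWS, key=lambda row: row[1])
# Last row, i.e. the checkpoint whose metrics the card reports.
final = ROWS[-1]

if __name__ == "__main__":
    print("best accuracy:", best_acc)
    print("lowest validation loss:", best_loss)
    print("final epoch:", final)
```

In the full table, accuracy peaks at epoch 33 (0.7446) and validation loss is lowest at epoch 42 (0.6841), while the reported results correspond to epoch 60 (loss 0.6966, accuracy 0.7407).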

Framework versions

Transformers 4.30.0
Pytorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3
