
2_9e-3_10_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.9490
  • Accuracy: 0.7434
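
As a minimal usage sketch (not an author-provided snippet): the repository id Onutoa/2_9e-3_10_0.5 is taken from this page, and the sequence-classification head and sentence-pair input are assumptions, since the card does not state which SuperGLUE subtask the model was trained on.

```python
# Hedged usage sketch: assumes the checkpoint is hosted as
# "Onutoa/2_9e-3_10_0.5" and carries a sequence-classification head.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/2_9e-3_10_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Sentence-pair input, as is typical for SuperGLUE classification tasks.
inputs = tokenizer("Example passage.", "Example question?", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```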

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
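
The card does not say which SuperGLUE subtask was used. Purely as an illustrative placeholder ("boolq" below is a hypothetical configuration name, not confirmed for this model), the data can be loaded with the datasets library:

```python
# Illustrative only: "boolq" is a placeholder config name; the card does
# not state which SuperGLUE subtask this model was trained on.
from datasets import load_dataset

dataset = load_dataset("super_glue", "boolq")
print(dataset)              # train/validation/test splits
print(dataset["train"][0])  # a single example
```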

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 0.009
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
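
As a hedged sketch of how these values map onto transformers' TrainingArguments (Transformers 4.30.0): the output_dir and evaluation strategy are assumptions, and model/dataset wiring is omitted.

```python
# Hedged sketch mapping the listed hyperparameters onto TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="2_9e-3_10_0.5",   # assumed; mirrors the model name
    learning_rate=9e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",  # assumed; consistent with one eval row per epoch
)
```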

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:--------------|:------|:------|:----------------|:---------|
| 2.4195        | 1.0   | 590   | 2.4975          | 0.3783   |
| 2.2824        | 2.0   | 1180  | 1.9145          | 0.6012   |
| 2.1458        | 3.0   | 1770  | 2.3359          | 0.6217   |
| 2.1747        | 4.0   | 2360  | 2.1157          | 0.6535   |
| 1.9504        | 5.0   | 2950  | 1.5636          | 0.6502   |
| 1.7882        | 6.0   | 3540  | 1.6203          | 0.6315   |
| 1.6871        | 7.0   | 4130  | 1.4819          | 0.6394   |
| 1.6471        | 8.0   | 4720  | 2.7794          | 0.6217   |
| 1.7323        | 9.0   | 5310  | 4.0220          | 0.6462   |
| 1.5353        | 10.0  | 5900  | 1.6458          | 0.6789   |
| 1.5678        | 11.0  | 6490  | 1.1800          | 0.7043   |
| 1.3291        | 12.0  | 7080  | 1.2374          | 0.7165   |
| 1.4272        | 13.0  | 7670  | 1.1377          | 0.7110   |
| 1.3034        | 14.0  | 8260  | 1.1466          | 0.7183   |
| 1.2451        | 15.0  | 8850  | 1.2199          | 0.7177   |
| 1.2807        | 16.0  | 9440  | 1.0946          | 0.7272   |
| 1.2129        | 17.0  | 10030 | 1.1599          | 0.7073   |
| 1.1857        | 18.0  | 10620 | 1.0682          | 0.7248   |
| 1.1625        | 19.0  | 11210 | 1.2619          | 0.7272   |
| 1.0859        | 20.0  | 11800 | 1.0746          | 0.7349   |
| 1.1021        | 21.0  | 12390 | 1.0435          | 0.7287   |
| 1.0416        | 22.0  | 12980 | 1.3806          | 0.7312   |
| 1.0426        | 23.0  | 13570 | 1.2656          | 0.7330   |
| 1.0436        | 24.0  | 14160 | 1.1256          | 0.7034   |
| 1.0052        | 25.0  | 14750 | 1.7754          | 0.7232   |
| 1.0031        | 26.0  | 15340 | 1.0313          | 0.7211   |
| 0.9812        | 27.0  | 15930 | 1.0008          | 0.7373   |
| 0.9123        | 28.0  | 16520 | 0.9610          | 0.7361   |
| 0.9127        | 29.0  | 17110 | 0.9778          | 0.7410   |
| 0.9232        | 30.0  | 17700 | 1.0516          | 0.7388   |
| 0.899         | 31.0  | 18290 | 1.0108          | 0.7183   |
| 0.8414        | 32.0  | 18880 | 1.0194          | 0.7416   |
| 0.8741        | 33.0  | 19470 | 1.1150          | 0.7135   |
| 0.8151        | 34.0  | 20060 | 1.1255          | 0.7385   |
| 0.864         | 35.0  | 20650 | 0.9919          | 0.7336   |
| 0.7863        | 36.0  | 21240 | 1.0934          | 0.7468   |
| 0.8047        | 37.0  | 21830 | 1.0928          | 0.7190   |
| 0.7751        | 38.0  | 22420 | 1.0014          | 0.7477   |
| 0.7889        | 39.0  | 23010 | 0.9600          | 0.7434   |
| 0.7376        | 40.0  | 23600 | 1.1391          | 0.7450   |
| 0.7727        | 41.0  | 24190 | 1.0360          | 0.7453   |
| 0.7564        | 42.0  | 24780 | 0.9761          | 0.7446   |
| 0.7398        | 43.0  | 25370 | 1.0142          | 0.7379   |
| 0.73          | 44.0  | 25960 | 1.0133          | 0.7407   |
| 0.7074        | 45.0  | 26550 | 0.9570          | 0.7431   |
| 0.7035        | 46.0  | 27140 | 0.9833          | 0.7474   |
| 0.6909        | 47.0  | 27730 | 1.0047          | 0.7346   |
| 0.7054        | 48.0  | 28320 | 1.0054          | 0.7440   |
| 0.6762        | 49.0  | 28910 | 0.9666          | 0.7495   |
| 0.6722        | 50.0  | 29500 | 0.9731          | 0.7404   |
| 0.6523        | 51.0  | 30090 | 0.9867          | 0.7422   |
| 0.6572        | 52.0  | 30680 | 0.9576          | 0.7468   |
| 0.6577        | 53.0  | 31270 | 0.9527          | 0.7456   |
| 0.6532        | 54.0  | 31860 | 0.9492          | 0.7453   |
| 0.6529        | 55.0  | 32450 | 0.9646          | 0.7404   |
| 0.6303        | 56.0  | 33040 | 0.9561          | 0.7434   |
| 0.6273        | 57.0  | 33630 | 0.9568          | 0.7465   |
| 0.6091        | 58.0  | 34220 | 0.9435          | 0.7483   |
| 0.6205        | 59.0  | 34810 | 0.9537          | 0.7483   |
| 0.6153        | 60.0  | 35400 | 0.9490          | 0.7434   |
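
The evaluation results reported at the top of this card (loss 0.9490, accuracy 0.7434) correspond to the final epoch (60); the best validation accuracy of the run, 0.7495, was reached at epoch 49.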

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
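
To sanity-check that a local environment matches these versions, a small (assumed, not author-provided) snippet:

```python
# Hedged environment check against the versions listed above.
import datasets
import tokenizers
import torch
import transformers

print(transformers.__version__)  # expect 4.30.0
print(torch.__version__)         # expect 2.0.1+cu117
print(datasets.__version__)      # expect 2.14.4
print(tokenizers.__version__)    # expect 0.13.3
```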