2_6e-3_1_0.1

This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the [super_glue](https://huggingface.co/datasets/super_glue) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5859
  • Accuracy: 0.7254
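
The snippet below is a minimal inference sketch for loading this checkpoint. The repo id `Onutoa/2_6e-3_1_0.1` comes from this card; the sequence-classification head and the passage/question input pair are assumptions, since the card does not state which SuperGLUE task was used.

```python
# Minimal inference sketch. The classification head and the
# passage/question input format are assumptions; the card does not
# say which SuperGLUE task this checkpoint was fine-tuned on.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/2_6e-3_1_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer(
    "The castle was built in the 12th century.",  # placeholder passage
    "Was the castle built before 1200?",          # placeholder question
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```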

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.006
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
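
A hedged sketch of how these hyperparameters map onto `transformers.TrainingArguments` (Transformers 4.30.0). The `output_dir` is a placeholder, and the dataset loading, tokenization, and `Trainer` call are omitted; the per-epoch evaluation strategy is an assumption inferred from the results table below.

```python
# Sketch only: maps the hyperparameters above onto TrainingArguments.
# output_dir is a placeholder; the data pipeline and Trainer invocation
# are omitted. Trainer's default AdamW matches the listed betas/epsilon.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="2_6e-3_1_0.1",    # placeholder
    learning_rate=6e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",  # assumption: eval once per epoch, as in the table below
)
```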

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.0992 | 1.0 | 590 | 1.0242 | 0.3783 |
| 0.8881 | 2.0 | 1180 | 0.9820 | 0.3817 |
| 0.8638 | 3.0 | 1770 | 0.9819 | 0.3783 |
| 0.8712 | 4.0 | 2360 | 0.8440 | 0.3789 |
| 0.8299 | 5.0 | 2950 | 0.7281 | 0.6217 |
| 0.8746 | 6.0 | 3540 | 0.6816 | 0.6049 |
| 0.9153 | 7.0 | 4130 | 0.6879 | 0.5281 |
| 0.8459 | 8.0 | 4720 | 0.6251 | 0.6333 |
| 0.7986 | 9.0 | 5310 | 1.0586 | 0.6217 |
| 0.8116 | 10.0 | 5900 | 0.6938 | 0.6434 |
| 0.789 | 11.0 | 6490 | 0.7268 | 0.6511 |
| 0.7792 | 12.0 | 7080 | 0.6182 | 0.6593 |
| 0.7814 | 13.0 | 7670 | 1.2212 | 0.4502 |
| 0.7899 | 14.0 | 8260 | 0.6923 | 0.6621 |
| 0.7264 | 15.0 | 8850 | 0.6417 | 0.6706 |
| 0.7226 | 16.0 | 9440 | 0.7098 | 0.5881 |
| 0.7009 | 17.0 | 10030 | 0.5964 | 0.6673 |
| 0.7149 | 18.0 | 10620 | 0.7206 | 0.6141 |
| 0.6615 | 19.0 | 11210 | 0.6004 | 0.6850 |
| 0.6847 | 20.0 | 11800 | 0.9306 | 0.6575 |
| 0.6563 | 21.0 | 12390 | 0.7185 | 0.6823 |
| 0.643 | 22.0 | 12980 | 0.6512 | 0.6502 |
| 0.6407 | 23.0 | 13570 | 0.6875 | 0.6832 |
| 0.6207 | 24.0 | 14160 | 0.6471 | 0.6593 |
| 0.5944 | 25.0 | 14750 | 0.6547 | 0.7080 |
| 0.6082 | 26.0 | 15340 | 0.6463 | 0.6532 |
| 0.6005 | 27.0 | 15930 | 0.5753 | 0.7018 |
| 0.5711 | 28.0 | 16520 | 0.5725 | 0.7119 |
| 0.5729 | 29.0 | 17110 | 0.5858 | 0.7223 |
| 0.556 | 30.0 | 17700 | 0.5890 | 0.7245 |
| 0.5549 | 31.0 | 18290 | 0.5599 | 0.7138 |
| 0.5355 | 32.0 | 18880 | 0.7710 | 0.6945 |
| 0.5358 | 33.0 | 19470 | 0.5839 | 0.7144 |
| 0.503 | 34.0 | 20060 | 0.6080 | 0.7324 |
| 0.5149 | 35.0 | 20650 | 0.6178 | 0.7107 |
| 0.5099 | 36.0 | 21240 | 0.5268 | 0.7275 |
| 0.5114 | 37.0 | 21830 | 0.5852 | 0.7269 |
| 0.4823 | 38.0 | 22420 | 0.5647 | 0.7229 |
| 0.4736 | 39.0 | 23010 | 0.6011 | 0.7339 |
| 0.4757 | 40.0 | 23600 | 0.7783 | 0.7208 |
| 0.4761 | 41.0 | 24190 | 0.5780 | 0.7294 |
| 0.464 | 42.0 | 24780 | 0.6204 | 0.7312 |
| 0.4545 | 43.0 | 25370 | 0.5590 | 0.7214 |
| 0.45 | 44.0 | 25960 | 0.6851 | 0.7156 |
| 0.4424 | 45.0 | 26550 | 0.6311 | 0.7095 |
| 0.4276 | 46.0 | 27140 | 0.5536 | 0.7211 |
| 0.4401 | 47.0 | 27730 | 0.5773 | 0.7269 |
| 0.4319 | 48.0 | 28320 | 0.5876 | 0.7269 |
| 0.4211 | 49.0 | 28910 | 0.5829 | 0.7312 |
| 0.4126 | 50.0 | 29500 | 0.6142 | 0.7232 |
| 0.4183 | 51.0 | 30090 | 0.5985 | 0.7251 |
| 0.4045 | 52.0 | 30680 | 0.6185 | 0.7211 |
| 0.4058 | 53.0 | 31270 | 0.6073 | 0.7336 |
| 0.402 | 54.0 | 31860 | 0.6035 | 0.7232 |
| 0.4031 | 55.0 | 32450 | 0.6014 | 0.7284 |
| 0.3964 | 56.0 | 33040 | 0.5933 | 0.7300 |
| 0.3932 | 57.0 | 33630 | 0.5683 | 0.7263 |
| 0.3954 | 58.0 | 34220 | 0.5942 | 0.7254 |
| 0.3898 | 59.0 | 34810 | 0.5832 | 0.7294 |
| 0.3842 | 60.0 | 35400 | 0.5859 | 0.7254 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
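
For reproducibility, a quick way to confirm your environment matches the versions listed above (a convenience sketch, not part of the original card):

```python
# Convenience sketch: print installed versions to compare against the list above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # expect 4.30.0
print("Pytorch:", torch.__version__)              # expect 2.0.1+cu117
print("Datasets:", datasets.__version__)          # expect 2.14.4
print("Tokenizers:", tokenizers.__version__)      # expect 0.13.3
```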