
2_5e-3_10_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6847
  • Accuracy: 0.7226
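
For quick use, the snippet below is a minimal loading sketch with the Transformers API. The sequence-classification head is an assumption: the card does not state which SuperGLUE task the model was fine-tuned on, so the example inputs are hypothetical.

```python
# Minimal loading sketch. The classification head and the input format
# are assumptions; the card does not name the SuperGLUE task.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/2_5e-3_10_0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Hypothetical sentence-pair input; replace with the actual task format.
inputs = tokenizer(
    "Is the sky blue?",
    "The sky appears blue in daylight.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```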

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
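
The sketch below shows one way these settings map onto TrainingArguments; it is an illustration, not the exact training script. The dataset objects are hypothetical placeholders, since the card does not name the SuperGLUE task, and the Adam betas/epsilon listed above are the TrainingArguments defaults, so they need no explicit flags.

```python
# Sketch of TrainingArguments matching the hyperparameters above.
# train_ds / eval_ds are hypothetical; the card does not specify the task.
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased")

args = TrainingArguments(
    output_dir="2_5e-3_10_0.1",
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=60.0,
    lr_scheduler_type="linear",   # linear decay, as listed above
    evaluation_strategy="epoch",  # matches the per-epoch eval log below
)

# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_ds, eval_dataset=eval_ds)
# trainer.train()
```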

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.2209        | 1.0   | 590   | 0.9564          | 0.6162   |
| 1.1661        | 2.0   | 1180  | 0.9456          | 0.5817   |
| 1.1103        | 3.0   | 1770  | 0.9574          | 0.6214   |
| 1.0789        | 4.0   | 2360  | 0.9671          | 0.6217   |
| 1.0422        | 5.0   | 2950  | 1.0276          | 0.4997   |
| 0.9949        | 6.0   | 3540  | 0.8934          | 0.6312   |
| 0.99          | 7.0   | 4130  | 1.5786          | 0.4119   |
| 0.9632        | 8.0   | 4720  | 1.2903          | 0.6232   |
| 0.9329        | 9.0   | 5310  | 0.8528          | 0.6352   |
| 0.9157        | 10.0  | 5900  | 0.8400          | 0.6557   |
| 0.9187        | 11.0  | 6490  | 0.9022          | 0.6404   |
| 0.8408        | 12.0  | 7080  | 0.8227          | 0.6679   |
| 0.8295        | 13.0  | 7670  | 1.4711          | 0.5606   |
| 0.9554        | 14.0  | 8260  | 0.8134          | 0.6884   |
| 0.7759        | 15.0  | 8850  | 0.7988          | 0.6774   |
| 0.7568        | 16.0  | 9440  | 0.9273          | 0.6031   |
| 0.7197        | 17.0  | 10030 | 0.7468          | 0.6966   |
| 0.739         | 18.0  | 10620 | 0.7418          | 0.6976   |
| 0.725         | 19.0  | 11210 | 0.7303          | 0.7043   |
| 0.7215        | 20.0  | 11800 | 0.7322          | 0.7024   |
| 0.7028        | 21.0  | 12390 | 0.7489          | 0.7073   |
| 0.6929        | 22.0  | 12980 | 0.7376          | 0.7125   |
| 0.6907        | 23.0  | 13570 | 0.7165          | 0.7122   |
| 0.6862        | 24.0  | 14160 | 0.7102          | 0.7101   |
| 0.6583        | 25.0  | 14750 | 0.7060          | 0.7193   |
| 0.6713        | 26.0  | 15340 | 0.7305          | 0.6905   |
| 0.6625        | 27.0  | 15930 | 0.7407          | 0.6914   |
| 0.6516        | 28.0  | 16520 | 0.7057          | 0.7232   |
| 0.6465        | 29.0  | 17110 | 0.7047          | 0.7135   |
| 0.6389        | 30.0  | 17700 | 0.7340          | 0.7272   |
| 0.6333        | 31.0  | 18290 | 0.7067          | 0.7055   |
| 0.6212        | 32.0  | 18880 | 0.7071          | 0.7235   |
| 0.6179        | 33.0  | 19470 | 0.6851          | 0.7202   |
| 0.5935        | 34.0  | 20060 | 0.6888          | 0.7187   |
| 0.5851        | 35.0  | 20650 | 0.7105          | 0.6985   |
| 0.5921        | 36.0  | 21240 | 0.6810          | 0.7284   |
| 0.5838        | 37.0  | 21830 | 0.6814          | 0.7315   |
| 0.5746        | 38.0  | 22420 | 0.6984          | 0.7086   |
| 0.5744        | 39.0  | 23010 | 0.6864          | 0.7214   |
| 0.5628        | 40.0  | 23600 | 0.6842          | 0.7260   |
| 0.5694        | 41.0  | 24190 | 0.7091          | 0.7083   |
| 0.5595        | 42.0  | 24780 | 0.6805          | 0.7214   |
| 0.5552        | 43.0  | 25370 | 0.6899          | 0.7321   |
| 0.5553        | 44.0  | 25960 | 0.7324          | 0.7021   |
| 0.5439        | 45.0  | 26550 | 0.6960          | 0.7122   |
| 0.5328        | 46.0  | 27140 | 0.6965          | 0.7131   |
| 0.5367        | 47.0  | 27730 | 0.6844          | 0.7257   |
| 0.5377        | 48.0  | 28320 | 0.6752          | 0.7275   |
| 0.5364        | 49.0  | 28910 | 0.6861          | 0.7165   |
| 0.5224        | 50.0  | 29500 | 0.6903          | 0.7153   |
| 0.5239        | 51.0  | 30090 | 0.6895          | 0.7202   |
| 0.5259        | 52.0  | 30680 | 0.6885          | 0.7162   |
| 0.5235        | 53.0  | 31270 | 0.6772          | 0.7281   |
| 0.5227        | 54.0  | 31860 | 0.7113          | 0.7141   |
| 0.5176        | 55.0  | 32450 | 0.6802          | 0.7266   |
| 0.5116        | 56.0  | 33040 | 0.6807          | 0.7284   |
| 0.5029        | 57.0  | 33630 | 0.6786          | 0.7239   |
| 0.5068        | 58.0  | 34220 | 0.6862          | 0.7226   |
| 0.498         | 59.0  | 34810 | 0.6838          | 0.7251   |
| 0.5037        | 60.0  | 35400 | 0.6847          | 0.7226   |
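
As a consistency check on the log: 35,400 total optimizer steps over 60 epochs works out to 590 steps per epoch, which at a train batch size of 16 implies a training split of roughly 9,400 examples.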

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3