# 20230817010018
This model is a fine-tuned version of [bert-large-cased](https://huggingface.co/bert-large-cased) on the super_glue dataset. It achieves the following results on the evaluation set:
- Loss: 0.3346
- Accuracy: 0.6931
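
As a rough illustration, the sketch below shows one way to load the checkpoint for inference with the `transformers` API. The repo id is a placeholder, and because this card does not state which SuperGLUE task the model was fine-tuned on, the sentence-pair input format and the meaning of the predicted label are assumptions.

```python
# Hedged sketch: loading the fine-tuned checkpoint for inference.
# "your-username/20230817010018" is a placeholder repo id; the specific
# SuperGLUE task (and therefore the label meanings) is not documented here.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "your-username/20230817010018"  # placeholder, not a confirmed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Many SuperGLUE tasks are sentence-pair classification, so the inputs are
# encoded as a pair here; adjust to the actual task's input format.
inputs = tokenizer(
    "The cat sat on the mat.",
    "A cat is on a mat.",
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
predicted_class = logits.argmax(dim=-1).item()
print(predicted_class)
```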
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.003
- train_batch_size: 8
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 60.0
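
For reference, the hyperparameters listed above map onto the following `TrainingArguments` configuration (Transformers 4.30.x). This is a minimal sketch rather than the exact training script: the base-model loading, the `output_dir` name, the per-epoch evaluation setting, and the dataset wiring are assumptions, and the SuperGLUE task and preprocessing are not documented in this card.

```python
# Hedged sketch: passing the listed hyperparameters to the Hugging Face Trainer.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model = AutoModelForSequenceClassification.from_pretrained("bert-large-cased")
tokenizer = AutoTokenizer.from_pretrained("bert-large-cased")

training_args = TrainingArguments(
    output_dir="20230817010018",      # assumed output directory name
    learning_rate=3e-3,               # 0.003, as listed above
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",      # assumed; matches the per-epoch metrics below
)

# The datasets are omitted here because the SuperGLUE task is not specified:
# trainer = Trainer(
#     model=model,
#     args=training_args,
#     train_dataset=...,   # tokenized SuperGLUE train split (placeholder)
#     eval_dataset=...,    # tokenized SuperGLUE validation split (placeholder)
#     tokenizer=tokenizer,
# )
# trainer.train()
```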
### Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
No log | 1.0 | 312 | 0.4302 | 0.4585 |
0.5241 | 2.0 | 624 | 0.3721 | 0.5379 |
0.5241 | 3.0 | 936 | 0.4359 | 0.4693 |
0.4404 | 4.0 | 1248 | 0.4139 | 0.4729 |
0.444 | 5.0 | 1560 | 0.5513 | 0.5307 |
0.444 | 6.0 | 1872 | 0.3854 | 0.4729 |
0.4526 | 7.0 | 2184 | 0.3593 | 0.4729 |
0.4526 | 8.0 | 2496 | 0.3700 | 0.5271 |
0.4555 | 9.0 | 2808 | 0.4814 | 0.4693 |
0.4401 | 10.0 | 3120 | 0.4095 | 0.5271 |
0.4401 | 11.0 | 3432 | 0.5372 | 0.5415 |
0.438 | 12.0 | 3744 | 0.3496 | 0.5271 |
0.4381 | 13.0 | 4056 | 0.5447 | 0.5415 |
0.4381 | 14.0 | 4368 | 0.4662 | 0.5668 |
0.4127 | 15.0 | 4680 | 0.3524 | 0.6282 |
0.4127 | 16.0 | 4992 | 0.3402 | 0.6137 |
0.4123 | 17.0 | 5304 | 0.7254 | 0.5776 |
0.4017 | 18.0 | 5616 | 0.3577 | 0.5632 |
0.4017 | 19.0 | 5928 | 0.3274 | 0.6715 |
0.3919 | 20.0 | 6240 | 0.3557 | 0.6173 |
0.3628 | 21.0 | 6552 | 0.3646 | 0.4946 |
0.3628 | 22.0 | 6864 | 0.3489 | 0.5993 |
0.3556 | 23.0 | 7176 | 0.4147 | 0.6354 |
0.3556 | 24.0 | 7488 | 0.3447 | 0.6931 |
0.3508 | 25.0 | 7800 | 0.3240 | 0.6931 |
0.3419 | 26.0 | 8112 | 0.3411 | 0.6751 |
0.3419 | 27.0 | 8424 | 0.3374 | 0.6931 |
0.3398 | 28.0 | 8736 | 0.3280 | 0.6751 |
0.3426 | 29.0 | 9048 | 0.3681 | 0.6968 |
0.3426 | 30.0 | 9360 | 0.3634 | 0.6823 |
0.337 | 31.0 | 9672 | 0.3663 | 0.6570 |
0.337 | 32.0 | 9984 | 0.3359 | 0.6931 |
0.3369 | 33.0 | 10296 | 0.3239 | 0.6823 |
0.3335 | 34.0 | 10608 | 0.3313 | 0.7076 |
0.3335 | 35.0 | 10920 | 0.3246 | 0.7040 |
0.3307 | 36.0 | 11232 | 0.3624 | 0.6859 |
0.329 | 37.0 | 11544 | 0.3669 | 0.6823 |
0.329 | 38.0 | 11856 | 0.3467 | 0.7040 |
0.3287 | 39.0 | 12168 | 0.3498 | 0.6968 |
0.3287 | 40.0 | 12480 | 0.3408 | 0.6931 |
0.3264 | 41.0 | 12792 | 0.3236 | 0.7004 |
0.324 | 42.0 | 13104 | 0.3363 | 0.7112 |
0.324 | 43.0 | 13416 | 0.3384 | 0.6859 |
0.3244 | 44.0 | 13728 | 0.3388 | 0.6895 |
0.3226 | 45.0 | 14040 | 0.3335 | 0.7004 |
0.3226 | 46.0 | 14352 | 0.3314 | 0.7040 |
0.3222 | 47.0 | 14664 | 0.3278 | 0.7148 |
0.3222 | 48.0 | 14976 | 0.3407 | 0.6931 |
0.3186 | 49.0 | 15288 | 0.3328 | 0.7112 |
0.3183 | 50.0 | 15600 | 0.3363 | 0.7076 |
0.3183 | 51.0 | 15912 | 0.3318 | 0.7040 |
0.3153 | 52.0 | 16224 | 0.3305 | 0.7004 |
0.3152 | 53.0 | 16536 | 0.3502 | 0.6751 |
0.3152 | 54.0 | 16848 | 0.3396 | 0.6823 |
0.3144 | 55.0 | 17160 | 0.3282 | 0.7112 |
0.3144 | 56.0 | 17472 | 0.3449 | 0.6823 |
0.3134 | 57.0 | 17784 | 0.3301 | 0.7148 |
0.312 | 58.0 | 18096 | 0.3348 | 0.6931 |
0.312 | 59.0 | 18408 | 0.3352 | 0.6931 |
0.3118 | 60.0 | 18720 | 0.3346 | 0.6931 |
### Framework versions
- Transformers 4.30.0
- Pytorch 2.0.1
- Datasets 2.14.4
- Tokenizers 0.13.3