20230821113948

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3480
  • Accuracy: 0.5415
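
For reference, the checkpoint can be loaded with the standard transformers auto classes. This is a minimal sketch, assuming the model is published under the repository id shown on this page (Onutoa/20230821113948) and that the saved head is a sequence-classification head; the specific SuperGLUE subtask, and therefore the expected input format and label meanings, is not stated in the card.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumption: the hub repo id matches the model name on this page.
model_id = "Onutoa/20230821113948"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Hypothetical input: whether the task takes single sentences or
# sentence pairs depends on the (undocumented) SuperGLUE subtask.
inputs = tokenizer("Is the sky blue?", "The sky is blue.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```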

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
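
As a rough reproduction aid, the hyperparameters above map onto transformers TrainingArguments as follows. This is a sketch, not the original training script: output_dir and the evaluation settings are assumptions (the per-epoch evaluation is inferred from the results table below), and the dataset preprocessing is not shown in the card.

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; output_dir and
# evaluation_strategy are assumptions, not values from the card.
training_args = TrainingArguments(
    output_dir="20230821113948",
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",  # the table below reports one eval per epoch
)
```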

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.3785          | 0.4729   |
| 0.363         | 2.0   | 624   | 0.3515          | 0.4729   |
| 0.363         | 3.0   | 936   | 0.3491          | 0.5343   |
| 0.362         | 4.0   | 1248  | 0.3494          | 0.5199   |
| 0.3635        | 5.0   | 1560  | 0.3517          | 0.4729   |
| 0.3635        | 6.0   | 1872  | 0.3486          | 0.5271   |
| 0.3606        | 7.0   | 2184  | 0.3482          | 0.5271   |
| 0.3606        | 8.0   | 2496  | 0.3491          | 0.5126   |
| 0.3595        | 9.0   | 2808  | 0.3511          | 0.5271   |
| 0.3617        | 10.0  | 3120  | 0.3541          | 0.5235   |
| 0.3617        | 11.0  | 3432  | 0.3492          | 0.4765   |
| 0.3597        | 12.0  | 3744  | 0.3539          | 0.4729   |
| 0.3574        | 13.0  | 4056  | 0.3499          | 0.4801   |
| 0.3574        | 14.0  | 4368  | 0.3509          | 0.4729   |
| 0.3588        | 15.0  | 4680  | 0.3487          | 0.5199   |
| 0.3588        | 16.0  | 4992  | 0.3483          | 0.5307   |
| 0.3557        | 17.0  | 5304  | 0.3577          | 0.5271   |
| 0.3554        | 18.0  | 5616  | 0.3488          | 0.5343   |
| 0.3554        | 19.0  | 5928  | 0.3631          | 0.4729   |
| 0.3572        | 20.0  | 6240  | 0.3558          | 0.5235   |
| 0.3571        | 21.0  | 6552  | 0.3499          | 0.5415   |
| 0.3571        | 22.0  | 6864  | 0.3520          | 0.5271   |
| 0.3566        | 23.0  | 7176  | 0.3602          | 0.5379   |
| 0.3566        | 24.0  | 7488  | 0.3489          | 0.5343   |
| 0.3559        | 25.0  | 7800  | 0.3494          | 0.5307   |
| 0.354         | 26.0  | 8112  | 0.3572          | 0.4729   |
| 0.354         | 27.0  | 8424  | 0.3634          | 0.4729   |
| 0.3552        | 28.0  | 8736  | 0.3487          | 0.5271   |
| 0.3541        | 29.0  | 9048  | 0.3487          | 0.5235   |
| 0.3541        | 30.0  | 9360  | 0.3493          | 0.4838   |
| 0.354         | 31.0  | 9672  | 0.3511          | 0.5379   |
| 0.354         | 32.0  | 9984  | 0.3481          | 0.5343   |
| 0.3552        | 33.0  | 10296 | 0.3479          | 0.5307   |
| 0.3535        | 34.0  | 10608 | 0.3483          | 0.5379   |
| 0.3535        | 35.0  | 10920 | 0.3481          | 0.5343   |
| 0.352         | 36.0  | 11232 | 0.3499          | 0.4765   |
| 0.3502        | 37.0  | 11544 | 0.3490          | 0.5235   |
| 0.3502        | 38.0  | 11856 | 0.3483          | 0.5271   |
| 0.3528        | 39.0  | 12168 | 0.3495          | 0.5343   |
| 0.3528        | 40.0  | 12480 | 0.3493          | 0.5415   |
| 0.353         | 41.0  | 12792 | 0.3491          | 0.5343   |
| 0.3527        | 42.0  | 13104 | 0.3566          | 0.4729   |
| 0.3527        | 43.0  | 13416 | 0.3479          | 0.5271   |
| 0.3515        | 44.0  | 13728 | 0.3496          | 0.4657   |
| 0.3526        | 45.0  | 14040 | 0.3518          | 0.4729   |
| 0.3526        | 46.0  | 14352 | 0.3486          | 0.5415   |
| 0.3517        | 47.0  | 14664 | 0.3534          | 0.4729   |
| 0.3517        | 48.0  | 14976 | 0.3503          | 0.5451   |
| 0.352         | 49.0  | 15288 | 0.3482          | 0.5379   |
| 0.3512        | 50.0  | 15600 | 0.3484          | 0.5415   |
| 0.3512        | 51.0  | 15912 | 0.3488          | 0.5271   |
| 0.3521        | 52.0  | 16224 | 0.3513          | 0.4729   |
| 0.3499        | 53.0  | 16536 | 0.3480          | 0.5307   |
| 0.3499        | 54.0  | 16848 | 0.3480          | 0.5379   |
| 0.3503        | 55.0  | 17160 | 0.3481          | 0.5415   |
| 0.3503        | 56.0  | 17472 | 0.3480          | 0.5307   |
| 0.3515        | 57.0  | 17784 | 0.3492          | 0.4838   |
| 0.3507        | 58.0  | 18096 | 0.3481          | 0.5379   |
| 0.3507        | 59.0  | 18408 | 0.3480          | 0.5379   |
| 0.3505        | 60.0  | 18720 | 0.3480          | 0.5415   |
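
The validation numbers above can be reproduced in outline with datasets and a simple accuracy loop. The sketch below is an assumption-heavy illustration: the card does not name the SuperGLUE subtask, so "boolq" and its question/passage fields are placeholders, not confirmed details of the original evaluation.

```python
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/20230821113948"
task = "boolq"  # assumption: the card does not say which SuperGLUE task was used

dataset = load_dataset("super_glue", task, split="validation")
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

correct = 0
for example in dataset:
    # BoolQ pairs a question with a passage; other subtasks use different fields.
    inputs = tokenizer(example["question"], example["passage"],
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(dim=-1).item()
    correct += int(pred == example["label"])

print(f"accuracy: {correct / len(dataset):.4f}")
```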

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3