20230821215812

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 11.6614
  • Accuracy: 0.4729
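
The card does not say which SuperGLUE task the checkpoint was trained for, but the accuracy metric implies a sequence-classification head. A minimal inference sketch, assuming the repository id Onutoa/20230821215812 shown on this page and a standard classification head (the sentence-pair input and label meaning are illustrative assumptions):

```python
# Minimal inference sketch; assumes the checkpoint exposes a standard
# sequence-classification head. The exact SuperGLUE task is not stated
# on this card, so the input format and label meaning are assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "Onutoa/20230821215812"  # repository id shown on this page
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

# Sentence-pair input, as used by several SuperGLUE tasks.
inputs = tokenizer("Is the sky blue?", "The sky is blue.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```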

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
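
These values map directly onto the Hugging Face Trainer API. A minimal sketch reconstructing the configuration, assuming the run used stock TrainingArguments (the actual training script is not included in this card):

```python
# Sketch of a TrainingArguments object matching the hyperparameters above.
# Reconstruction only: the real training script for this run is not published.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230821215812",   # assumption: output dir named after the model
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                # "Adam with betas=(0.9,0.999)"
    adam_beta2=0.999,
    adam_epsilon=1e-8,             # "epsilon=1e-08"
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",   # assumption: results table logs one eval per epoch
    logging_strategy="epoch",
)
```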

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 30.4563         | 0.5162   |
| 31.8281       | 2.0   | 624   | 28.2683         | 0.4729   |
| 31.8281       | 3.0   | 936   | 22.4829         | 0.4729   |
| 26.6026       | 4.0   | 1248  | 17.2508         | 0.4729   |
| 20.7188       | 5.0   | 1560  | 15.6956         | 0.5271   |
| 20.7188       | 6.0   | 1872  | 14.7599         | 0.4729   |
| 18.7808       | 7.0   | 2184  | 14.4331         | 0.5271   |
| 18.7808       | 8.0   | 2496  | 13.9366         | 0.5271   |
| 18.0838       | 9.0   | 2808  | 13.6340         | 0.4729   |
| 17.722        | 10.0  | 3120  | 13.4379         | 0.4729   |
| 17.722        | 11.0  | 3432  | 13.4393         | 0.4729   |
| 17.4783       | 12.0  | 3744  | 13.1376         | 0.4729   |
| 17.2699       | 13.0  | 4056  | 12.9599         | 0.4729   |
| 17.2699       | 14.0  | 4368  | 12.8480         | 0.4729   |
| 17.0966       | 15.0  | 4680  | 12.7813         | 0.4729   |
| 17.0966       | 16.0  | 4992  | 12.6920         | 0.5271   |
| 16.9613       | 17.0  | 5304  | 12.5694         | 0.5271   |
| 16.848        | 18.0  | 5616  | 12.5194         | 0.5271   |
| 16.848        | 19.0  | 5928  | 12.4591         | 0.4729   |
| 16.7661       | 20.0  | 6240  | 12.3827         | 0.5271   |
| 16.6825       | 21.0  | 6552  | 12.3410         | 0.4729   |
| 16.6825       | 22.0  | 6864  | 12.3241         | 0.5271   |
| 16.5963       | 23.0  | 7176  | 12.3296         | 0.5271   |
| 16.5963       | 24.0  | 7488  | 12.2611         | 0.4729   |
| 16.5513       | 25.0  | 7800  | 12.1515         | 0.5271   |
| 16.4926       | 26.0  | 8112  | 12.1194         | 0.4729   |
| 16.4926       | 27.0  | 8424  | 12.1052         | 0.4729   |
| 16.4398       | 28.0  | 8736  | 12.0516         | 0.5271   |
| 16.399        | 29.0  | 9048  | 12.0210         | 0.4946   |
| 16.399        | 30.0  | 9360  | 12.0054         | 0.4729   |
| 16.3657       | 31.0  | 9672  | 11.9960         | 0.5271   |
| 16.3657       | 32.0  | 9984  | 11.9548         | 0.5271   |
| 16.3306       | 33.0  | 10296 | 11.9332         | 0.5271   |
| 16.294        | 34.0  | 10608 | 11.9148         | 0.4729   |
| 16.294        | 35.0  | 10920 | 11.9225         | 0.4729   |
| 16.2657       | 36.0  | 11232 | 11.8726         | 0.4765   |
| 16.2465       | 37.0  | 11544 | 11.8452         | 0.4729   |
| 16.2465       | 38.0  | 11856 | 11.8341         | 0.5271   |
| 16.208        | 39.0  | 12168 | 11.8232         | 0.4729   |
| 16.208        | 40.0  | 12480 | 11.7979         | 0.4729   |
| 16.191        | 41.0  | 12792 | 11.7895         | 0.4729   |
| 16.1729       | 42.0  | 13104 | 11.8391         | 0.4729   |
| 16.1729       | 43.0  | 13416 | 11.7619         | 0.5271   |
| 16.1571       | 44.0  | 13728 | 11.7502         | 0.4729   |
| 16.1268       | 45.0  | 14040 | 11.7520         | 0.4729   |
| 16.1268       | 46.0  | 14352 | 11.7539         | 0.4729   |
| 16.1194       | 47.0  | 14664 | 11.7541         | 0.4729   |
| 16.1194       | 48.0  | 14976 | 11.7130         | 0.5271   |
| 16.11         | 49.0  | 15288 | 11.7020         | 0.5271   |
| 16.0989       | 50.0  | 15600 | 11.6949         | 0.4729   |
| 16.0989       | 51.0  | 15912 | 11.7026         | 0.4729   |
| 16.0802       | 52.0  | 16224 | 11.7056         | 0.4729   |
| 16.0765       | 53.0  | 16536 | 11.6793         | 0.5271   |
| 16.0765       | 54.0  | 16848 | 11.6759         | 0.5271   |
| 16.0629       | 55.0  | 17160 | 11.6712         | 0.4729   |
| 16.0629       | 56.0  | 17472 | 11.6660         | 0.4946   |
| 16.0619       | 57.0  | 17784 | 11.6662         | 0.4729   |
| 16.0566       | 58.0  | 18096 | 11.6643         | 0.4729   |
| 16.0566       | 59.0  | 18408 | 11.6616         | 0.4729   |
| 16.0547       | 60.0  | 18720 | 11.6614         | 0.4729   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
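
The versions above can be checked against a local environment before attempting to reproduce the run. A small sketch (package names are the standard PyPI distributions; "+cu117" is the CUDA build tag on the listed PyTorch wheel):

```python
# Sketch: compare installed package versions with those listed on this card.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": "4.30.0",
    "torch": "2.0.1",   # card lists 2.0.1+cu117; the build tag may differ locally
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__.split("+")[0],  # drop the build tag for comparison
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    status = "ok" if installed[name] == want else f"mismatch (found {installed[name]})"
    print(f"{name}: expected {want}, {status}")
```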