20230822011214

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

Loss: 13.1424
Accuracy: 0.4729

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 8
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60.0
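
The linear schedule and step counts listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the training code itself: the card lists no warmup, so `warmup_steps=0` is assumed, and the total of 18720 optimizer steps is taken from the results table (312 steps per epoch over 60 epochs). The function name `linear_lr` is hypothetical.

```python
BASE_LR = 0.0005          # learning_rate from the card
STEPS_PER_EPOCH = 312     # step count logged at epoch 1.0 in the results table
NUM_EPOCHS = 60           # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 18720, matches the last table row


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate at a given optimizer step under linear decay.

    Assumption: no warmup (warmup_steps=0), since the card does not list any.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup (unused when warmup_steps=0).
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the scheduled learning rate at a few epoch boundaries.
for epoch in (0, 15, 30, 45, 60):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the learning rate starts at 0.0005 and reaches zero at step 18720, which may help explain why the validation loss changes so little over the final epochs.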

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	312	34.1366	0.4729
34.4899	2.0	624	31.6158	0.4982
34.4899	3.0	936	29.7502	0.4765
31.3598	4.0	1248	29.3626	0.5018
29.6767	5.0	1560	29.1220	0.4729
29.6767	6.0	1872	28.7672	0.5307
29.2217	7.0	2184	27.2268	0.5126
29.2217	8.0	2496	23.7819	0.4982
27.2285	9.0	2808	20.2651	0.5271
23.6907	10.0	3120	17.8350	0.5271
23.6907	11.0	3432	16.7909	0.4729
21.0475	12.0	3744	16.1897	0.4729
20.1309	13.0	4056	15.7234	0.4729
20.1309	14.0	4368	15.4084	0.4729
19.6553	15.0	4680	15.1657	0.4729
19.6553	16.0	4992	14.9716	0.5271
19.3496	17.0	5304	14.7880	0.5271
19.122	18.0	5616	14.6322	0.4729
19.122	19.0	5928	14.5424	0.4729
18.9517	20.0	6240	14.4178	0.5271
18.7994	21.0	6552	14.2725	0.4729
18.7994	22.0	6864	14.2138	0.5271
18.6835	23.0	7176	14.1064	0.5271
18.6835	24.0	7488	14.0401	0.4729
18.59	25.0	7800	13.9478	0.4729
18.504	26.0	8112	13.9156	0.4729
18.504	27.0	8424	13.8335	0.4729
18.4387	28.0	8736	13.7761	0.4729
18.3758	29.0	9048	13.7312	0.4729
18.3758	30.0	9360	13.6791	0.4729
18.3264	31.0	9672	13.6458	0.5271
18.3264	32.0	9984	13.5991	0.4729
18.2808	33.0	10296	13.5762	0.5271
18.2355	34.0	10608	13.5283	0.4729
18.2355	35.0	10920	13.4919	0.4729
18.2071	36.0	11232	13.4721	0.4729
18.1831	37.0	11544	13.4375	0.4729
18.1831	38.0	11856	13.4097	0.5271
18.1448	39.0	12168	13.4004	0.5271
18.1448	40.0	12480	13.3691	0.5271
18.1182	41.0	12792	13.3430	0.4729
18.1006	42.0	13104	13.3514	0.4729
18.1006	43.0	13416	13.3017	0.4729
18.0785	44.0	13728	13.2838	0.4729
18.0562	45.0	14040	13.2687	0.4729
18.0562	46.0	14352	13.2555	0.4729
18.0454	47.0	14664	13.2510	0.4729
18.0454	48.0	14976	13.2384	0.5271
18.0293	49.0	15288	13.2096	0.4729
18.0221	50.0	15600	13.2013	0.4729
18.0221	51.0	15912	13.1936	0.4729
17.9969	52.0	16224	13.1813	0.4729
17.9919	53.0	16536	13.1736	0.4729
17.9919	54.0	16848	13.1681	0.5271
17.9823	55.0	17160	13.1559	0.4729
17.9823	56.0	17472	13.1537	0.4729
17.9804	57.0	17784	13.1490	0.4729
17.9743	58.0	18096	13.1461	0.4729
17.9743	59.0	18408	13.1429	0.4729
17.9703	60.0	18720	13.1424	0.4729
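
The accuracy column above can be summarized with a short script. This is a minimal sketch: the values are copied from the table in epoch order, and the variable names are illustrative. It finds the best epoch and lists the distinct accuracy values seen from epoch 9 onward.

```python
# Validation accuracy per epoch (epochs 1..60), copied from the table above.
accuracy = [
    0.4729, 0.4982, 0.4765, 0.5018, 0.4729, 0.5307, 0.5126, 0.4982, 0.5271, 0.5271,
    0.4729, 0.4729, 0.4729, 0.4729, 0.4729, 0.5271, 0.5271, 0.4729, 0.4729, 0.5271,
    0.4729, 0.5271, 0.5271, 0.4729, 0.4729, 0.4729, 0.4729, 0.4729, 0.4729, 0.4729,
    0.5271, 0.4729, 0.5271, 0.4729, 0.4729, 0.4729, 0.4729, 0.5271, 0.5271, 0.5271,
    0.4729, 0.4729, 0.4729, 0.4729, 0.4729, 0.4729, 0.4729, 0.5271, 0.4729, 0.4729,
    0.4729, 0.4729, 0.4729, 0.5271, 0.4729, 0.4729, 0.4729, 0.4729, 0.4729, 0.4729,
]

# Epoch with the highest validation accuracy (epochs are 1-indexed).
best_index = max(range(len(accuracy)), key=lambda i: accuracy[i])
best_epoch = best_index + 1
best_accuracy = accuracy[best_index]

# Distinct accuracy values from epoch 9 onward.
late_values = sorted(set(accuracy[8:]))

print(f"best epoch: {best_epoch}, accuracy: {best_accuracy}")
print(f"distinct values from epoch 9 on: {late_values}")
print(f"sum of those values: {sum(late_values):.4f}")
```

From epoch 9 onward the accuracy takes only two values, 0.4729 and 0.5271, which sum to 1.0. That pattern is consistent with a binary classifier that predicts a single class for every example and switches class between epochs, although the card does not confirm this. The reported final accuracy of 0.4729 is the last-epoch value, not the best one in the table.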

Framework versions

Transformers 4.30.0
Pytorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3

Onutoa/20230822011214
