
5e-3_10_0.1

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.6700
  • Accuracy: 0.7365
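
The snippet below is a minimal inference sketch, not an official usage example: it assumes the checkpoint exposes a standard sequence-classification head under the Onutoa/5e-3_10_0.1 repository id, and the sentence pair is purely illustrative, since the exact SuperGLUE subset is not documented in this card.

```python
# Minimal inference sketch (assumptions: sequence-classification head,
# sentence-pair input; the SuperGLUE subset is not documented here).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/5e-3_10_0.1"  # repository id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer(
    "The cat sat on the mat.",     # first sentence (illustrative)
    "A cat is resting on a mat.",  # second sentence (illustrative)
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label id
```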

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
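
Since the specific SuperGLUE subset is not stated in this card, the sketch below only shows how the super_glue dataset is generally loaded with the datasets library; the "rte" configuration name is a placeholder, not a documented fact about this model.

```python
# Dataset-loading sketch (the "rte" subset is a placeholder; the actual
# SuperGLUE configuration used for this model is not documented).
from datasets import load_dataset

dataset = load_dataset("super_glue", "rte")  # placeholder configuration
print(dataset["train"][0])                   # inspect one training example
```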

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
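
As a rough reproduction aid, the sketch below maps the hyperparameters above onto transformers.TrainingArguments (version 4.30.0, per the framework list at the end of this card). The output_dir and evaluation_strategy values are assumptions; the original training script is not included here.

```python
# TrainingArguments sketch matching the hyperparameters listed above.
# output_dir and evaluation_strategy are assumptions, not documented values.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="5e-3_10_0.1",        # assumed; matches the model name
    learning_rate=5e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",     # assumed from the per-epoch log below
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Transformers
# default optimizer settings, so no extra optimizer arguments are required.
```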

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.9081          | 0.5271   |
| 0.937         | 2.0   | 624   | 0.6140          | 0.5704   |
| 0.937         | 3.0   | 936   | 0.8444          | 0.4729   |
| 0.8284        | 4.0   | 1248  | 0.7307          | 0.6245   |
| 0.8066        | 5.0   | 1560  | 1.2493          | 0.5487   |
| 0.8066        | 6.0   | 1872  | 0.6752          | 0.6643   |
| 0.7461        | 7.0   | 2184  | 0.8410          | 0.6282   |
| 0.7461        | 8.0   | 2496  | 0.7924          | 0.6390   |
| 0.6874        | 9.0   | 2808  | 0.6100          | 0.7184   |
| 0.67          | 10.0  | 3120  | 0.7658          | 0.6895   |
| 0.67          | 11.0  | 3432  | 0.8649          | 0.6426   |
| 0.6374        | 12.0  | 3744  | 0.5784          | 0.7545   |
| 0.5735        | 13.0  | 4056  | 0.5793          | 0.7292   |
| 0.5735        | 14.0  | 4368  | 0.6332          | 0.7437   |
| 0.4712        | 15.0  | 4680  | 0.5207          | 0.7581   |
| 0.4712        | 16.0  | 4992  | 0.5339          | 0.7292   |
| 0.4258        | 17.0  | 5304  | 0.7625          | 0.7220   |
| 0.3712        | 18.0  | 5616  | 0.5492          | 0.7365   |
| 0.3712        | 19.0  | 5928  | 0.5661          | 0.7437   |
| 0.3656        | 20.0  | 6240  | 0.7445          | 0.7184   |
| 0.327         | 21.0  | 6552  | 0.5874          | 0.7437   |
| 0.327         | 22.0  | 6864  | 0.6301          | 0.7365   |
| 0.3015        | 23.0  | 7176  | 0.6740          | 0.7148   |
| 0.3015        | 24.0  | 7488  | 0.6599          | 0.7220   |
| 0.2929        | 25.0  | 7800  | 0.6697          | 0.7292   |
| 0.2609        | 26.0  | 8112  | 0.6871          | 0.7256   |
| 0.2609        | 27.0  | 8424  | 0.6303          | 0.7220   |
| 0.2581        | 28.0  | 8736  | 0.6768          | 0.7040   |
| 0.2504        | 29.0  | 9048  | 0.6986          | 0.7148   |
| 0.2504        | 30.0  | 9360  | 0.6783          | 0.7148   |
| 0.2313        | 31.0  | 9672  | 0.7120          | 0.7076   |
| 0.2313        | 32.0  | 9984  | 0.6227          | 0.7148   |
| 0.2209        | 33.0  | 10296 | 0.6961          | 0.7220   |
| 0.2141        | 34.0  | 10608 | 0.6817          | 0.7220   |
| 0.2141        | 35.0  | 10920 | 0.6810          | 0.7256   |
| 0.2129        | 36.0  | 11232 | 0.6567          | 0.7292   |
| 0.2053        | 37.0  | 11544 | 0.7469          | 0.7329   |
| 0.2053        | 38.0  | 11856 | 0.6684          | 0.7329   |
| 0.2014        | 39.0  | 12168 | 0.6540          | 0.7329   |
| 0.2014        | 40.0  | 12480 | 0.6679          | 0.7437   |
| 0.2012        | 41.0  | 12792 | 0.6582          | 0.7292   |
| 0.1957        | 42.0  | 13104 | 0.6635          | 0.7292   |
| 0.1957        | 43.0  | 13416 | 0.6715          | 0.7401   |
| 0.1903        | 44.0  | 13728 | 0.6628          | 0.7329   |
| 0.1861        | 45.0  | 14040 | 0.6674          | 0.7329   |
| 0.1861        | 46.0  | 14352 | 0.7008          | 0.7220   |
| 0.1858        | 47.0  | 14664 | 0.6371          | 0.7401   |
| 0.1858        | 48.0  | 14976 | 0.6630          | 0.7437   |
| 0.1852        | 49.0  | 15288 | 0.6353          | 0.7365   |
| 0.1868        | 50.0  | 15600 | 0.7010          | 0.7401   |
| 0.1868        | 51.0  | 15912 | 0.6572          | 0.7365   |
| 0.1813        | 52.0  | 16224 | 0.6531          | 0.7401   |
| 0.1807        | 53.0  | 16536 | 0.6413          | 0.7437   |
| 0.1807        | 54.0  | 16848 | 0.6605          | 0.7473   |
| 0.1792        | 55.0  | 17160 | 0.6498          | 0.7437   |
| 0.1792        | 56.0  | 17472 | 0.6865          | 0.7437   |
| 0.1764        | 57.0  | 17784 | 0.6660          | 0.7365   |
| 0.1726        | 58.0  | 18096 | 0.6829          | 0.7473   |
| 0.1726        | 59.0  | 18408 | 0.6730          | 0.7437   |
| 0.1761        | 60.0  | 18720 | 0.6700          | 0.7365   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3