3e-2_10_0.1

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (these match the final epoch, 60, in the training results table):

Loss: 0.6258
Accuracy: 0.5379

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.03
train_batch_size: 8
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60.0
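
To make the linear scheduler setting concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card does not list a warmup setting) and takes the step count per epoch from the first row of the training results table; the names `linear_lr`, `INITIAL_LR`, `STEPS_PER_EPOCH`, `NUM_EPOCHS`, and `TOTAL_STEPS` are illustrative and not part of the original training script.

```python
# Sketch of a linear learning-rate decay to zero.
# Assumption: no warmup steps, since the card does not mention any.
INITIAL_LR = 0.03       # learning_rate from the card
STEPS_PER_EPOCH = 312   # step count logged at epoch 1.0 in the results table
NUM_EPOCHS = 60         # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 18720, the final step in the table


def linear_lr(step: int, initial_lr: float = INITIAL_LR,
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer updates under linear decay."""
    remaining = max(0, total_steps - step)
    return initial_lr * remaining / total_steps


if __name__ == "__main__":
    # Print the scheduled rate at a few points in training.
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.4f}")
```

Under these assumptions the rate starts at 0.03, is halved by the midpoint of training (step 9360), and reaches zero at step 18720.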

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	312	1.7487	0.5271
3.9657	2.0	624	1.6713	0.4729
3.9657	3.0	936	1.7039	0.4729
2.3312	4.0	1248	4.3023	0.5271
2.1711	5.0	1560	0.8582	0.4729
2.1711	6.0	1872	0.6298	0.4982
1.7918	7.0	2184	1.4449	0.5271
1.7918	8.0	2496	0.6374	0.5271
1.4918	9.0	2808	1.6588	0.4729
1.5706	10.0	3120	0.6965	0.5090
1.5706	11.0	3432	1.0698	0.5271
1.4388	12.0	3744	0.8561	0.4729
1.2519	13.0	4056	0.6604	0.5271
1.2519	14.0	4368	1.1529	0.5271
1.1804	15.0	4680	0.7657	0.4729
1.1804	16.0	4992	0.6331	0.4838
1.1249	17.0	5304	1.2513	0.5271
1.161	18.0	5616	1.5477	0.5271
1.161	19.0	5928	0.6309	0.5126
1.1646	20.0	6240	0.6461	0.5235
1.0512	21.0	6552	1.0072	0.5271
1.0512	22.0	6864	0.7228	0.5271
1.0792	23.0	7176	1.2781	0.4729
1.0792	24.0	7488	0.8418	0.4729
1.0817	25.0	7800	1.0903	0.5271
1.0233	26.0	8112	0.9363	0.5271
1.0233	27.0	8424	0.8552	0.4729
0.982	28.0	8736	0.7299	0.4729
0.926	29.0	9048	0.6380	0.4440
0.926	30.0	9360	1.5393	0.5271
0.9613	31.0	9672	0.7258	0.4729
0.9613	32.0	9984	0.8471	0.5271
0.8893	33.0	10296	0.6271	0.5271
0.904	34.0	10608	0.6718	0.5271
0.904	35.0	10920	0.6358	0.4874
0.9034	36.0	11232	0.9034	0.4729
0.887	37.0	11544	0.7764	0.5271
0.887	38.0	11856	0.6706	0.4729
0.8477	39.0	12168	0.6326	0.5271
0.8477	40.0	12480	0.6265	0.5054
0.8539	41.0	12792	0.6624	0.5271
0.8147	42.0	13104	0.6563	0.5271
0.8147	43.0	13416	0.6304	0.4729
0.8202	44.0	13728	0.6489	0.4729
0.7907	45.0	14040	0.7081	0.5271
0.7907	46.0	14352	0.6311	0.4368
0.7947	47.0	14664	0.6740	0.4729
0.7947	48.0	14976	0.6262	0.5379
0.7523	49.0	15288	0.6370	0.4729
0.7378	50.0	15600	0.6247	0.5271
0.7378	51.0	15912	0.6253	0.5162
0.7219	52.0	16224	0.7281	0.4729
0.7043	53.0	16536	0.6248	0.5271
0.7043	54.0	16848	0.6247	0.5271
0.6898	55.0	17160	0.6630	0.4729
0.6898	56.0	17472	0.6596	0.5271
0.6822	57.0	17784	0.6302	0.5271
0.6656	58.0	18096	0.6270	0.4910
0.6656	59.0	18408	0.6256	0.5271
0.6559	60.0	18720	0.6258	0.5379
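
Validation loss and accuracy in the table do not always agree on which epoch is best. The following is a minimal sketch of how one might pick an epoch by either metric, using rows copied from the last part of the table (epochs 48 to 60 only). The tuple layout and the helper name `best_epochs` are illustrative assumptions, not part of the original training code.

```python
# Rows copied from the results table above (epochs 48-60 only):
# (epoch, step, validation_loss, accuracy)
ROWS = [
    (48, 14976, 0.6262, 0.5379),
    (49, 15288, 0.6370, 0.4729),
    (50, 15600, 0.6247, 0.5271),
    (51, 15912, 0.6253, 0.5162),
    (52, 16224, 0.7281, 0.4729),
    (53, 16536, 0.6248, 0.5271),
    (54, 16848, 0.6247, 0.5271),
    (55, 17160, 0.6630, 0.4729),
    (56, 17472, 0.6596, 0.5271),
    (57, 17784, 0.6302, 0.5271),
    (58, 18096, 0.6270, 0.4910),
    (59, 18408, 0.6256, 0.5271),
    (60, 18720, 0.6258, 0.5379),
]


def best_epochs(rows, column, higher_is_better):
    """Return every epoch that ties for the best value in `column`."""
    values = [row[column] for row in rows]
    target = max(values) if higher_is_better else min(values)
    return [row[0] for row in rows if row[column] == target]


lowest_loss_epochs = best_epochs(ROWS, column=2, higher_is_better=False)
highest_acc_epochs = best_epochs(ROWS, column=3, higher_is_better=True)

if __name__ == "__main__":
    print("lowest validation loss at epochs:", lowest_loss_epochs)
    print("highest accuracy at epochs:", highest_acc_epochs)
```

In this excerpt the two criteria select different epochs, so the choice of checkpoint depends on which metric is treated as primary.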

Framework versions

Transformers 4.30.0
PyTorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3
