20230822010704

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a usage sketch follows the results):

  • Loss: 12.2037
  • Accuracy: 0.4729
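
A minimal usage sketch, assuming the checkpoint is published as `Onutoa/20230822010704` with a sequence-classification head; the specific SuperGLUE task (and hence the expected input format) is not documented on this card, so the sentence-pair inputs below are placeholders:

```python
# Minimal inference sketch. The repo id and the sequence-classification head
# are assumptions based on this card; the inputs are placeholders, since the
# specific SuperGLUE task is not documented.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "Onutoa/20230822010704"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)

inputs = tokenizer("Example premise.", "Example hypothesis.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class id
```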

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
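
The card does not record which SuperGLUE configuration was used. As a hedged illustration only, a task can be inspected with the `datasets` library; `"boolq"` below is a placeholder config, not the documented training data:

```python
# Hedged illustration: "boolq" is a placeholder SuperGLUE config, since this
# card does not record which task the model was fine-tuned on.
from datasets import load_dataset

dataset = load_dataset("super_glue", "boolq")
print(dataset)                   # splits and sizes
print(dataset["validation"][0])  # one evaluation example
```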

Training procedure

Training hyperparameters

The following hyperparameters were used during training (restated as a code sketch after the list):

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
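
These settings map onto the Trainer API roughly as follows. This is a hedged reconstruction of the listed values, not the original training script, and `output_dir` is a placeholder:

```python
# Hedged reconstruction of the listed hyperparameters as TrainingArguments;
# this is not the original training script, and output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230822010704",  # placeholder
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
)
```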

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 33.1191         | 0.4513   |
| 33.504        | 2.0   | 624   | 30.1105         | 0.5126   |
| 33.504        | 3.0   | 936   | 28.6596         | 0.4729   |
| 29.6796       | 4.0   | 1248  | 28.1189         | 0.5018   |
| 28.1744       | 5.0   | 1560  | 24.7761         | 0.4729   |
| 28.1744       | 6.0   | 1872  | 21.9627         | 0.5235   |
| 24.4505       | 7.0   | 2184  | 19.0019         | 0.5271   |
| 24.4505       | 8.0   | 2496  | 17.1277         | 0.5271   |
| 21.3932       | 9.0   | 2808  | 16.1660         | 0.5271   |
| 19.6922       | 10.0  | 3120  | 15.5951         | 0.5271   |
| 19.6922       | 11.0  | 3432  | 15.0824         | 0.4729   |
| 18.9663       | 12.0  | 3744  | 14.8520         | 0.4729   |
| 18.4915       | 13.0  | 4056  | 14.5191         | 0.4729   |
| 18.4915       | 14.0  | 4368  | 14.2798         | 0.4729   |
| 18.1712       | 15.0  | 4680  | 14.1216         | 0.4729   |
| 18.1712       | 16.0  | 4992  | 13.9650         | 0.5271   |
| 17.9497       | 17.0  | 5304  | 13.8237         | 0.5307   |
| 17.7679       | 18.0  | 5616  | 13.7031         | 0.5271   |
| 17.7679       | 19.0  | 5928  | 13.6600         | 0.4729   |
| 17.6276       | 20.0  | 6240  | 13.4947         | 0.5271   |
| 17.4928       | 21.0  | 6552  | 13.3930         | 0.4729   |
| 17.4928       | 22.0  | 6864  | 13.3240         | 0.5271   |
| 17.3723       | 23.0  | 7176  | 13.2304         | 0.5271   |
| 17.3723       | 24.0  | 7488  | 13.1542         | 0.4729   |
| 17.2738       | 25.0  | 7800  | 13.0519         | 0.5271   |
| 17.1691       | 26.0  | 8112  | 13.0350         | 0.4729   |
| 17.1691       | 27.0  | 8424  | 12.9247         | 0.4729   |
| 17.0746       | 28.0  | 8736  | 12.8456         | 0.5126   |
| 16.9881       | 29.0  | 9048  | 12.7944         | 0.4729   |
| 16.9881       | 30.0  | 9360  | 12.7474         | 0.4729   |
| 16.9201       | 31.0  | 9672  | 12.7131         | 0.5271   |
| 16.9201       | 32.0  | 9984  | 12.6670         | 0.4729   |
| 16.8521       | 33.0  | 10296 | 12.6285         | 0.5271   |
| 16.7917       | 34.0  | 10608 | 12.5831         | 0.4729   |
| 16.7917       | 35.0  | 10920 | 12.5488         | 0.5271   |
| 16.7467       | 36.0  | 11232 | 12.5223         | 0.4729   |
| 16.7092       | 37.0  | 11544 | 12.4885         | 0.4729   |
| 16.7092       | 38.0  | 11856 | 12.4606         | 0.5271   |
| 16.6584       | 39.0  | 12168 | 12.4352         | 0.5271   |
| 16.6584       | 40.0  | 12480 | 12.4116         | 0.4729   |
| 16.6245       | 41.0  | 12792 | 12.3909         | 0.5271   |
| 16.5986       | 42.0  | 13104 | 12.4119         | 0.4729   |
| 16.5986       | 43.0  | 13416 | 12.3479         | 0.5271   |
| 16.5728       | 44.0  | 13728 | 12.3328         | 0.4729   |
| 16.5395       | 45.0  | 14040 | 12.3359         | 0.4729   |
| 16.5395       | 46.0  | 14352 | 12.3195         | 0.4729   |
| 16.5222       | 47.0  | 14664 | 12.3031         | 0.4729   |
| 16.5222       | 48.0  | 14976 | 12.2788         | 0.5271   |
| 16.5068       | 49.0  | 15288 | 12.2630         | 0.5596   |
| 16.4947       | 50.0  | 15600 | 12.2533         | 0.4729   |
| 16.4947       | 51.0  | 15912 | 12.2531         | 0.4729   |
| 16.4716       | 52.0  | 16224 | 12.2479         | 0.4729   |
| 16.4646       | 53.0  | 16536 | 12.2272         | 0.5271   |
| 16.4646       | 54.0  | 16848 | 12.2213         | 0.5271   |
| 16.4479       | 55.0  | 17160 | 12.2177         | 0.4729   |
| 16.4479       | 56.0  | 17472 | 12.2112         | 0.4765   |
| 16.447        | 57.0  | 17784 | 12.2106         | 0.4729   |
| 16.4403       | 58.0  | 18096 | 12.2055         | 0.4729   |
| 16.4403       | 59.0  | 18408 | 12.2039         | 0.4729   |
| 16.4371       | 60.0  | 18720 | 12.2037         | 0.4729   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3