20230822011123

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set after the final epoch (epoch 60, step 18720):

Loss: 12.7559
Accuracy: 0.4729
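
For readers who want to inspect the checkpoint, a minimal loading sketch follows. It rests on assumptions the card does not confirm: that the repository id is Onutoa/20230822011123 (as shown on the model page), that the checkpoint carries a sequence-classification head (suggested only by the accuracy metric), and that tokenizer files were uploaded alongside the weights. `load_model` is a hypothetical helper name.

```python
MODEL_ID = "Onutoa/20230822011123"  # ASSUMPTION: repository id as shown on the model page


def load_model(model_id: str = MODEL_ID):
    """Return (tokenizer, model) for the checkpoint.

    ASSUMPTION: the checkpoint has a sequence-classification head and ships
    its own tokenizer files; neither is stated in the card.
    """
    # Imported lazily so that defining this function needs neither the
    # transformers library nor a network download.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    return tokenizer, model
```

If the repository turns out to lack tokenizer files, loading the tokenizer from bert-large-cased (the base model named above) would be the natural fallback.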

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0005
train_batch_size: 8
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60.0
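
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given step for a linear warmup followed by linear decay to zero. The base rate and the total of 18720 steps come from this card; the warmup length is an assumption (the card lists none, so zero is used), and `linear_lr` is a hypothetical helper, not a library function.

```python
BASE_LR = 0.0005      # learning_rate from the card
TOTAL_STEPS = 18720   # final step in the training results (60 epochs x 312 steps)
WARMUP_STEPS = 0      # ASSUMPTION: the card lists no warmup setting


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at `step` under linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * 312
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the rate is halved by epoch 30 and reaches zero at the last step.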

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	312	33.1111	0.4693
33.5632	2.0	624	29.4330	0.4729
33.5632	3.0	936	28.6575	0.4729
29.5796	4.0	1248	27.5594	0.4946
27.7947	5.0	1560	24.0011	0.4729
27.7947	6.0	1872	21.8497	0.5307
24.4291	7.0	2184	18.9382	0.5271
24.4291	8.0	2496	17.0228	0.5271
21.7331	9.0	2808	16.2191	0.5271
20.2434	10.0	3120	15.6640	0.5271
20.2434	11.0	3432	15.3209	0.4729
19.5791	12.0	3744	15.0367	0.4729
19.1759	13.0	4056	14.7859	0.4729
19.1759	14.0	4368	14.5689	0.4729
18.9129	15.0	4680	14.4199	0.4729
18.9129	16.0	4992	14.3070	0.5271
18.7250	17.0	5304	14.2007	0.5271
18.5733	18.0	5616	14.0996	0.4729
18.5733	19.0	5928	14.0560	0.4729
18.4591	20.0	6240	13.9476	0.5271
18.3533	21.0	6552	13.8532	0.5271
18.3533	22.0	6864	13.8091	0.5271
18.2596	23.0	7176	13.7278	0.5271
18.2596	24.0	7488	13.6616	0.4729
18.1857	25.0	7800	13.5820	0.4729
18.1091	26.0	8112	13.5658	0.4729
18.1091	27.0	8424	13.4950	0.4729
18.0388	28.0	8736	13.4109	0.4729
17.9676	29.0	9048	13.3571	0.4729
17.9676	30.0	9360	13.3096	0.4729
17.9109	31.0	9672	13.2689	0.5271
17.9109	32.0	9984	13.2199	0.4729
17.8555	33.0	10296	13.1702	0.5271
17.7959	34.0	10608	13.1315	0.4729
17.7959	35.0	10920	13.0977	0.5271
17.7567	36.0	11232	13.0718	0.4729
17.7180	37.0	11544	13.0244	0.4729
17.7180	38.0	11856	13.0061	0.5271
17.6743	39.0	12168	12.9777	0.5271
17.6743	40.0	12480	12.9545	0.4729
17.6411	41.0	12792	12.9362	0.4729
17.6197	42.0	13104	12.9564	0.4729
17.6197	43.0	13416	12.8934	0.4729
17.5980	44.0	13728	12.8824	0.4729
17.5669	45.0	14040	12.8925	0.4729
17.5669	46.0	14352	12.8567	0.4729
17.5513	47.0	14664	12.8525	0.4729
17.5513	48.0	14976	12.8268	0.5271
17.5412	49.0	15288	12.8100	0.4729
17.5282	50.0	15600	12.8056	0.4729
17.5282	51.0	15912	12.7995	0.4729
17.5100	52.0	16224	12.7996	0.4729
17.5032	53.0	16536	12.7793	0.4729
17.5032	54.0	16848	12.7732	0.4729
17.4893	55.0	17160	12.7682	0.4729
17.4893	56.0	17472	12.7625	0.4729
17.4874	57.0	17784	12.7641	0.4729
17.4805	58.0	18096	12.7570	0.4729
17.4805	59.0	18408	12.7564	0.4729
17.4784	60.0	18720	12.7559	0.4729
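
Two things can be read off the table with simple arithmetic, sketched below on a handful of rows copied from it. First, the step count per epoch together with the batch size bounds the training set size; this assumes a single device, no gradient accumulation, and that the last partial batch is kept, none of which the card states. Second, from epoch 7 onward the accuracy only takes the values 0.4729 and 0.5271, which sum to 1. In a two-class task that pattern is what one would expect if the model predicted a single class for every example and switched class between epochs; this is an interpretation, not something the card reports. The helper names are hypothetical.

```python
# A few rows copied from the table: (epoch, step, validation loss, accuracy)
ROWS = [
    (1, 312, 33.1111, 0.4693),
    (4, 1248, 27.5594, 0.4946),
    (6, 1872, 21.8497, 0.5307),
    (10, 3120, 15.6640, 0.5271),
    (30, 9360, 13.3096, 0.4729),
    (60, 18720, 12.7559, 0.4729),
]
TRAIN_BATCH_SIZE = 8  # train_batch_size from the card


def steps_per_epoch(rows):
    """Return the optimizer steps per epoch, checking all rows agree."""
    ratios = {step // epoch for epoch, step, _, _ in rows}
    if len(ratios) != 1:
        raise ValueError(f"inconsistent steps per epoch: {sorted(ratios)}")
    return ratios.pop()


def train_size_range(steps, batch_size):
    """Smallest and largest dataset size that yields `steps` batches.

    ASSUMPTION: one device, no gradient accumulation, last partial batch kept.
    """
    return (steps - 1) * batch_size + 1, steps * batch_size


def late_accuracies(rows, from_epoch=7):
    """Distinct accuracy values reported at or after `from_epoch`."""
    return sorted({acc for epoch, _, _, acc in rows if epoch >= from_epoch})


if __name__ == "__main__":
    steps = steps_per_epoch(ROWS)
    print(steps)                                     # 312
    print(train_size_range(steps, TRAIN_BATCH_SIZE))  # (2489, 2496)
    low, high = late_accuracies(ROWS)
    print(low, high, abs(low + high - 1.0) < 1e-4)
```

A validation loss that keeps falling while accuracy stays pinned to these two values is worth checking against the learning rate of 0.0005, which is high for fine-tuning a BERT-large model.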

Framework versions

Transformers 4.30.0
Pytorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3
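
When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The sketch below uses only the standard library; the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are assumed to match how the packages were installed, and `compare_versions` is a hypothetical helper.

```python
from importlib import metadata

# Versions as listed in this card, keyed by the usual PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def compare_versions(expected=CARD_VERSIONS):
    """Map each package to (card version, installed version or None, exact match)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, same) in compare_versions().items():
        status = "ok" if same else "differs"
        print(f"{package:12s} card={wanted:12s} installed={installed}  {status}")
```

The comparison is an exact string match, so a CPU-only or differently built PyTorch wheel will be reported as differing even when the release number is the same.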

Model repository: Onutoa/20230822011123