20230821213736

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

Loss: 22.0684
Accuracy: 0.4801

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60.0
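
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the list above does not mention warmup) and decay to zero over 18720 total steps, which is 60 epochs at the 312 steps per epoch shown in the training results. The function name `linear_lr` is hypothetical and not part of any library.

```python
# Sketch of a linear learning-rate schedule. A warmup of 0 steps is an
# assumption, since the hyperparameter list does not mention warmup.
BASE_LR = 1e-4          # learning_rate from the hyperparameter list
STEPS_PER_EPOCH = 312   # step count after epoch 1 in the training results
NUM_EPOCHS = 60
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 18720


def linear_lr(step: int, base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS, warmup_steps: int = 0) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 30, 60):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 0.0001, is halved by epoch 30, and reaches zero at the final step.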

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	312	34.4593	0.4729
34.6035	2.0	624	34.1903	0.4729
34.6035	3.0	936	33.9397	0.5343
34.2607	4.0	1248	33.6773	0.5343
33.8346	5.0	1560	33.3601	0.4729
33.8346	6.0	1872	32.9334	0.5235
33.2988	7.0	2184	32.4093	0.5451
33.2988	8.0	2496	31.6614	0.5343
32.523	9.0	2808	31.1242	0.5487
31.6421	10.0	3120	30.7433	0.5271
31.6421	11.0	3432	30.4265	0.4910
30.9414	12.0	3744	30.1340	0.4729
30.3998	13.0	4056	29.6940	0.4729
30.3998	14.0	4368	29.2574	0.4838
29.7765	15.0	4680	28.9204	0.4729
29.7765	16.0	4992	28.7916	0.4729
29.2672	17.0	5304	28.7245	0.5379
29.0545	18.0	5616	28.6656	0.4729
29.0545	19.0	5928	28.6131	0.4729
28.9469	20.0	6240	28.5471	0.5126
28.8473	21.0	6552	28.4760	0.5343
28.8473	22.0	6864	28.3978	0.4765
28.7322	23.0	7176	28.3073	0.5271
28.7322	24.0	7488	28.1897	0.4729
28.5992	25.0	7800	28.0411	0.4729
28.4123	26.0	8112	27.8587	0.4729
28.4123	27.0	8424	27.6169	0.4729
28.1552	28.0	8736	27.2253	0.5018
27.7135	29.0	9048	26.7643	0.4729
27.7135	30.0	9360	26.2981	0.4693
27.1493	31.0	9672	25.9554	0.4874
27.1493	32.0	9984	25.6574	0.5018
26.68	33.0	10296	25.3846	0.4729
26.3235	34.0	10608	25.0976	0.4729
26.3235	35.0	10920	24.8303	0.4874
25.9833	36.0	11232	24.5811	0.4729
25.6663	37.0	11544	24.3341	0.4874
25.6663	38.0	11856	24.1074	0.4729
25.3808	39.0	12168	23.9099	0.4874
25.3808	40.0	12480	23.7138	0.5343
25.12	41.0	12792	23.5439	0.4874
24.8956	42.0	13104	23.3745	0.4729
24.8956	43.0	13416	23.2148	0.5162
24.6833	44.0	13728	23.0665	0.4765
24.498	45.0	14040	22.9456	0.4729
24.498	46.0	14352	22.8208	0.4729
24.3449	47.0	14664	22.7087	0.4693
24.3449	48.0	14976	22.6159	0.4910
24.1996	49.0	15288	22.5243	0.4874
24.0892	50.0	15600	22.4457	0.4801
24.0892	51.0	15912	22.3728	0.4838
23.9876	52.0	16224	22.3081	0.4874
23.9068	53.0	16536	22.2526	0.4729
23.9068	54.0	16848	22.2029	0.4801
23.837	55.0	17160	22.1624	0.4874
23.837	56.0	17472	22.1289	0.4765
23.7911	57.0	17784	22.1029	0.4729
23.7521	58.0	18096	22.0854	0.4729
23.7521	59.0	18408	22.0726	0.4765
23.7328	60.0	18720	22.0684	0.4801
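
To pick out the strongest checkpoint from a log like the one above, the rows can be parsed and ranked by validation accuracy. The following is a minimal sketch, assuming tab-separated rows in the column order shown. `Row`, `parse_rows` and `best_by_accuracy` are hypothetical helper names, and only four rows copied from the table are used as sample input.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    train_loss: Optional[float]  # None where the log shows "No log"
    epoch: float
    step: int
    val_loss: float
    accuracy: float


def parse_rows(text: str) -> List[Row]:
    """Parse tab-separated result rows (header line excluded)."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, acc = line.split("\t")
        rows.append(Row(
            None if train_loss == "No log" else float(train_loss),
            float(epoch), int(step), float(val_loss), float(acc),
        ))
    return rows


def best_by_accuracy(rows: List[Row]) -> Row:
    """Return the row with the highest validation accuracy."""
    return max(rows, key=lambda r: r.accuracy)


# Four rows copied verbatim from the table above.
SAMPLE = "\n".join([
    "No log\t1.0\t312\t34.4593\t0.4729",
    "33.2988\t7.0\t2184\t32.4093\t0.5451",
    "32.523\t9.0\t2808\t31.1242\t0.5487",
    "23.7328\t60.0\t18720\t22.0684\t0.4801",
])

if __name__ == "__main__":
    best = best_by_accuracy(parse_rows(SAMPLE))
    print(f"best accuracy {best.accuracy} at epoch {best.epoch:g} (step {best.step})")
```

Across the full table, accuracy peaks at 0.5487 after epoch 9, whereas the final epoch has the lowest validation loss (22.0684) but an accuracy of only 0.4801, so the reported evaluation numbers correspond to the last checkpoint and not to the most accurate one.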

Framework versions

Transformers 4.30.0
PyTorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3

Onutoa/20230821213736
