20230817093322

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set after the final training epoch (epoch 60):

Loss: 0.3504
Accuracy: 0.7256

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.003
train_batch_size: 8
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60.0
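
The linear scheduler can be illustrated with a small sketch. It assumes zero warmup steps (the card lists no warmup setting) and a decay to zero at the last optimizer step; the function name `linear_lr` is hypothetical. The 312 steps per epoch are taken from the step column of the training results.

```python
# Sketch of a linear learning-rate decay, assuming no warmup steps.
BASE_LR = 0.003          # learning_rate from the card
STEPS_PER_EPOCH = 312    # step count reported after epoch 1
NUM_EPOCHS = 60          # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 18720, the final reported step


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer updates under linear decay to zero."""
    if not 0 <= step <= total_steps:
        raise ValueError("step must lie in [0, total_steps]")
    return base_lr * (total_steps - step) / total_steps


if __name__ == "__main__":
    for epoch in (0, 30, 60):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the learning rate is halved by the end of epoch 30 and reaches zero at step 18720.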

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	312	0.5126	0.5235
0.5126	2.0	624	0.3824	0.4765
0.5126	3.0	936	0.3692	0.4910
0.4613	4.0	1248	0.3941	0.5343
0.446	5.0	1560	0.6773	0.5271
0.446	6.0	1872	0.5516	0.5271
0.4477	7.0	2184	0.3517	0.5199
0.4477	8.0	2496	0.3772	0.4910
0.4263	9.0	2808	0.3690	0.4838
0.4397	10.0	3120	0.3512	0.4838
0.4397	11.0	3432	0.4716	0.5379
0.4425	12.0	3744	0.3605	0.6570
0.4269	13.0	4056	0.3571	0.5379
0.4269	14.0	4368	0.3545	0.4838
0.3975	15.0	4680	0.3744	0.6498
0.3975	16.0	4992	0.3578	0.6606
0.3906	17.0	5304	0.3704	0.6931
0.3633	18.0	5616	0.3356	0.6065
0.3633	19.0	5928	0.3397	0.6065
0.3604	20.0	6240	0.3809	0.6931
0.3565	21.0	6552	0.3357	0.6787
0.3565	22.0	6864	0.3803	0.6209
0.3533	23.0	7176	0.3754	0.6751
0.3533	24.0	7488	0.3304	0.6354
0.3462	25.0	7800	0.3700	0.6968
0.3432	26.0	8112	0.3337	0.7148
0.3432	27.0	8424	0.3289	0.6968
0.3409	28.0	8736	0.3340	0.7148
0.3381	29.0	9048	0.3467	0.7220
0.3381	30.0	9360	0.3860	0.6823
0.337	31.0	9672	0.3795	0.6931
0.337	32.0	9984	0.3755	0.7184
0.334	33.0	10296	0.3529	0.7112
0.3321	34.0	10608	0.3389	0.7076
0.3321	35.0	10920	0.3260	0.7148
0.3315	36.0	11232	0.3519	0.7329
0.3317	37.0	11544	0.3741	0.6968
0.3317	38.0	11856	0.3364	0.7112
0.325	39.0	12168	0.3438	0.7256
0.325	40.0	12480	0.3462	0.7148
0.3282	41.0	12792	0.3344	0.7256
0.3251	42.0	13104	0.3280	0.7256
0.3251	43.0	13416	0.3544	0.7148
0.3223	44.0	13728	0.3488	0.7256
0.3215	45.0	14040	0.3437	0.7220
0.3215	46.0	14352	0.3430	0.7220
0.3205	47.0	14664	0.3394	0.7076
0.3205	48.0	14976	0.3676	0.7076
0.3163	49.0	15288	0.3487	0.7365
0.3154	50.0	15600	0.3387	0.7148
0.3154	51.0	15912	0.3448	0.7076
0.3164	52.0	16224	0.3361	0.7220
0.3153	53.0	16536	0.3676	0.7040
0.3153	54.0	16848	0.3463	0.7256
0.3145	55.0	17160	0.3491	0.7329
0.3145	56.0	17472	0.3599	0.7040
0.3151	57.0	17784	0.3457	0.7292
0.3103	58.0	18096	0.3489	0.7220
0.3103	59.0	18408	0.3481	0.7256
0.314	60.0	18720	0.3504	0.7256
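
The table can be scanned programmatically to compare the final epoch with the best one. The sketch below parses a handful of rows copied from the table (the remaining rows are omitted for brevity) and picks the epochs with the highest accuracy and the lowest validation loss; the helper name `load_rows` is hypothetical.

```python
import csv
import io

# A few tab-separated rows copied from the results table; the rest are omitted.
ROWS = """\
Training Loss\tEpoch\tStep\tValidation Loss\tAccuracy
No log\t1.0\t312\t0.5126\t0.5235
0.3321\t35.0\t10920\t0.3260\t0.7148
0.3315\t36.0\t11232\t0.3519\t0.7329
0.3163\t49.0\t15288\t0.3487\t0.7365
0.314\t60.0\t18720\t0.3504\t0.7256
"""


def load_rows(text):
    """Parse the tab-separated table into dicts with numeric fields."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        {
            "epoch": int(float(r["Epoch"])),
            "val_loss": float(r["Validation Loss"]),
            "accuracy": float(r["Accuracy"]),
        }
        for r in reader
    ]


rows = load_rows(ROWS)
best_acc = max(rows, key=lambda r: r["accuracy"])    # highest accuracy
best_loss = min(rows, key=lambda r: r["val_loss"])   # lowest validation loss
final = rows[-1]                                     # last epoch

print("best accuracy:", best_acc)
print("lowest validation loss:", best_loss)
print("final epoch:", final)
```

Across the full table, the highest accuracy (0.7365) occurs at epoch 49 and the lowest validation loss (0.3260) at epoch 35, whereas the headline figures of this card are those of epoch 60.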

Framework versions

Transformers 4.30.0
Pytorch 2.0.1
Datasets 2.14.4
Tokenizers 0.13.3
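
To reproduce results, it can help to confirm that the local environment matches these versions. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `find_mismatches` is hypothetical, and mapping "Pytorch" to the `torch` distribution name is an assumption.

```python
from importlib import metadata

# Versions listed in the card, keyed by distribution name
# (mapping "Pytorch" to the "torch" distribution is an assumption).
EXPECTED = {
    "transformers": "4.30.0",
    "torch": "2.0.1",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: installed version or None} for packages that differ."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build suffixes such as "2.0.1+cu118".
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, installed in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {EXPECTED[name]}, found {installed or 'not installed'}")
```

The `get_version` parameter exists only so the check can be exercised without the packages installed.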

Dataset used to train Onutoa/20230817093322: super_glue