20230820105148

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

Loss: 0.3319
Accuracy: 0.7292
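
The card does not say how the checkpoint is meant to be loaded, so the following is only a minimal sketch. It assumes the model is published on the Hub under the repo id Onutoa/20230820105148, that it carries a sequence-classification head, and that the unnamed super_glue subset takes a sentence pair as input. The helper name classify_pair is hypothetical.

```python
MODEL_ID = "Onutoa/20230820105148"  # assumed Hub repo id, not confirmed by the card


def classify_pair(first_text, second_text, model_id=MODEL_ID):
    """Return the predicted label name for a pair of texts (assumed input format)."""
    # Imported lazily so the module can be imported without torch/transformers installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Encode both texts as one sequence pair, truncating to the model's max length.
    inputs = tokenizer(first_text, second_text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits

    label_id = logits.argmax(dim=-1).item()
    # Fall back to the raw index if the config carries no label names.
    return model.config.id2label.get(label_id, str(label_id))
```

Calling classify_pair downloads the weights on first use. The label names it returns depend on how the checkpoint's config was saved and may be generic (for example LABEL_0).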

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.003
train_batch_size: 8
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60.0
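
As a rough illustration of what lr_scheduler_type: linear implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (none is listed above) and takes 312 steps per epoch from the training results table, which gives 60 × 312 = 18720 total steps. The function name linear_lr is hypothetical.

```python
BASE_LR = 0.003
NUM_EPOCHS = 60
STEPS_PER_EPOCH = 312  # taken from the results table: step 312 at epoch 1.0
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 18720, the final step in the table


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under linear decay to zero, with optional linear warmup.

    warmup_steps=0 is an assumption; the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for epoch in (0, 15, 30, 45, 60):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

With a train_batch_size of 8, 312 steps per epoch also bounds the training set at 2496 examples, assuming a single device and no gradient accumulation.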

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	312	0.4103	0.5271
0.4925	2.0	624	0.4974	0.5451
0.4925	3.0	936	0.3594	0.5704
0.4459	4.0	1248	0.4183	0.4693
0.44	5.0	1560	0.5487	0.5271
0.44	6.0	1872	0.3475	0.5379
0.4177	7.0	2184	0.6254	0.5271
0.4177	8.0	2496	0.3665	0.5884
0.3945	9.0	2808	0.4198	0.4982
0.4112	10.0	3120	0.3320	0.6823
0.4112	11.0	3432	0.3367	0.6173
0.359	12.0	3744	0.3249	0.6931
0.3421	13.0	4056	0.3311	0.6679
0.3421	14.0	4368	0.3228	0.6968
0.3351	15.0	4680	0.3210	0.7148
0.3351	16.0	4992	0.3376	0.6787
0.3289	17.0	5304	0.3285	0.6895
0.3761	18.0	5616	0.3637	0.4801
0.3761	19.0	5928	0.3538	0.5415
0.3983	20.0	6240	0.3642	0.5307
0.3472	21.0	6552	0.3444	0.6931
0.3472	22.0	6864	0.3312	0.7040
0.3194	23.0	7176	0.3450	0.6751
0.3194	24.0	7488	0.3325	0.6823
0.314	25.0	7800	0.3312	0.7220
0.3081	26.0	8112	0.3333	0.7040
0.3081	27.0	8424	0.3184	0.7184
0.3084	28.0	8736	0.3162	0.7112
0.3058	29.0	9048	0.3241	0.7184
0.3058	30.0	9360	0.3549	0.6751
0.3033	31.0	9672	0.3269	0.7184
0.3033	32.0	9984	0.3243	0.7004
0.3	33.0	10296	0.3370	0.7220
0.2906	34.0	10608	0.3198	0.7292
0.2906	35.0	10920	0.3237	0.7148
0.2934	36.0	11232	0.3207	0.7112
0.2921	37.0	11544	0.3450	0.7076
0.2921	38.0	11856	0.3338	0.7112
0.2873	39.0	12168	0.3207	0.7220
0.2873	40.0	12480	0.3233	0.7329
0.2861	41.0	12792	0.3212	0.7148
0.2852	42.0	13104	0.3255	0.7112
0.2852	43.0	13416	0.3353	0.7256
0.2787	44.0	13728	0.3332	0.7220
0.2796	45.0	14040	0.3427	0.7220
0.2796	46.0	14352	0.3407	0.7256
0.2759	47.0	14664	0.3203	0.7256
0.2759	48.0	14976	0.3333	0.7220
0.2761	49.0	15288	0.3283	0.7401
0.2734	50.0	15600	0.3187	0.7292
0.2734	51.0	15912	0.3298	0.7365
0.274	52.0	16224	0.3276	0.7401
0.2717	53.0	16536	0.3342	0.7292
0.2717	54.0	16848	0.3322	0.7292
0.2686	55.0	17160	0.3277	0.7329
0.2686	56.0	17472	0.3357	0.7292
0.2699	57.0	17784	0.3334	0.7365
0.2664	58.0	18096	0.3303	0.7292
0.2664	59.0	18408	0.3320	0.7292
0.2672	60.0	18720	0.3319	0.7292
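
The reported metrics (loss 0.3319, accuracy 0.7292) are those of the final epoch, not the best one in the table. The sketch below shows one way to parse the tab-separated log and pick out the best rows. Only a handful of rows are copied in to keep it short; the function name parse_rows is hypothetical.

```python
import csv
import io

# A subset of rows copied from the table above (tab-separated).
TABLE = "\n".join([
    "Training Loss\tEpoch\tStep\tValidation Loss\tAccuracy",
    "No log\t1.0\t312\t0.4103\t0.5271",
    "0.3084\t28.0\t8736\t0.3162\t0.7112",
    "0.2761\t49.0\t15288\t0.3283\t0.7401",
    "0.274\t52.0\t16224\t0.3276\t0.7401",
    "0.2672\t60.0\t18720\t0.3319\t0.7292",
])


def parse_rows(text):
    """Parse the tab-separated log into a list of dicts with numeric fields."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text), delimiter="\t"):
        loss = rec["Training Loss"]
        rows.append({
            # "No log" means no training loss had been logged yet.
            "train_loss": None if loss == "No log" else float(loss),
            "epoch": int(float(rec["Epoch"])),
            "step": int(rec["Step"]),
            "val_loss": float(rec["Validation Loss"]),
            "accuracy": float(rec["Accuracy"]),
        })
    return rows


rows = parse_rows(TABLE)
best_acc = max(rows, key=lambda r: r["accuracy"])   # first row wins on ties
best_loss = min(rows, key=lambda r: r["val_loss"])
final = rows[-1]
print(best_acc["epoch"], best_loss["epoch"], final["epoch"])
```

Run over the full table, the same selection gives a peak accuracy of 0.7401 (epochs 49 and 52) and a minimum validation loss of 0.3162 (epoch 28).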

Framework versions

Transformers 4.30.0
PyTorch 2.0.1
Datasets 2.14.4
Tokenizers 0.13.3
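
To check a local environment against these versions, something like the following could be used. It is a sketch only: the PyPI distribution names (transformers, torch, datasets, tokenizers) are assumed to correspond to the frameworks listed, and the helper compare_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.30.0",
    "torch": "2.0.1",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (expected, installed or None, versions match)."""
    report = {}
    for name, want in expected.items():
        try:
            have = version(name)
        except PackageNotFoundError:
            have = None  # package is not installed in this environment
        # Ignore local build suffixes such as "+cu117" when comparing.
        matches = have is not None and have.split("+")[0] == want
        report[name] = (want, have, matches)
    return report


for name, (want, have, ok) in compare_versions().items():
    print(f"{name}: expected {want}, installed {have}, match={ok}")
```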

Model repository: Onutoa/20230820105148
