20230817181727

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 0.3316
  • Accuracy: 0.7365
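
The card does not name which SuperGLUE subset was used. The per-epoch step counts reported further down (156 steps at train batch size 16, i.e. roughly 2,490 training examples) are consistent with the RTE entailment task, but treat that pairing as an inference, not a documented fact. Below is a minimal inference sketch, assuming the checkpoint is a standard sequence-classification model published on the Hub as Onutoa/20230817181727:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Model id taken from this card; the premise/hypothesis pairing below follows
# the SuperGLUE RTE format, which is an assumption -- the card does not name
# the subset the model was trained on.
model_id = "Onutoa/20230817181727"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

premise = "A dog is running through a field."
hypothesis = "An animal is outdoors."
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index; label names
                                     # depend on the fine-tuning setup
```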

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.004
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
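
The same settings can be expressed as a Transformers `TrainingArguments` object, as in the sketch below. Adam with betas=(0.9,0.999) and epsilon=1e-08 matches the library defaults, so it needs no extra flags; `output_dir` and the per-epoch evaluation cadence are assumptions, not stated on the card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./20230817181727",  # assumed; not stated on the card
    learning_rate=0.004,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",    # assumed from the per-epoch rows below
)
```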

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 156 | 0.4741 | 0.5307 |
| No log | 2.0 | 312 | 0.3849 | 0.5090 |
| No log | 3.0 | 468 | 0.4345 | 0.4729 |
| 0.5496 | 4.0 | 624 | 0.4749 | 0.5235 |
| 0.5496 | 5.0 | 780 | 0.4138 | 0.5343 |
| 0.5496 | 6.0 | 936 | 0.3599 | 0.5632 |
| 0.4365 | 7.0 | 1092 | 0.3954 | 0.5632 |
| 0.4365 | 8.0 | 1248 | 0.3455 | 0.5018 |
| 0.4365 | 9.0 | 1404 | 0.3985 | 0.5776 |
| 0.4109 | 10.0 | 1560 | 0.3828 | 0.5993 |
| 0.4109 | 11.0 | 1716 | 0.4339 | 0.4729 |
| 0.4109 | 12.0 | 1872 | 0.3432 | 0.5379 |
| 0.3611 | 13.0 | 2028 | 0.3395 | 0.6137 |
| 0.3611 | 14.0 | 2184 | 0.3404 | 0.6715 |
| 0.3611 | 15.0 | 2340 | 0.3396 | 0.6570 |
| 0.3611 | 16.0 | 2496 | 0.3857 | 0.6354 |
| 0.3456 | 17.0 | 2652 | 0.3480 | 0.6895 |
| 0.3456 | 18.0 | 2808 | 0.3348 | 0.7040 |
| 0.3456 | 19.0 | 2964 | 0.3323 | 0.6426 |
| 0.3391 | 20.0 | 3120 | 0.3591 | 0.6715 |
| 0.3391 | 21.0 | 3276 | 0.3378 | 0.7148 |
| 0.3391 | 22.0 | 3432 | 0.3453 | 0.7004 |
| 0.3319 | 23.0 | 3588 | 0.3405 | 0.6679 |
| 0.3319 | 24.0 | 3744 | 0.3451 | 0.6390 |
| 0.3319 | 25.0 | 3900 | 0.3665 | 0.6895 |
| 0.3274 | 26.0 | 4056 | 0.3290 | 0.7112 |
| 0.3274 | 27.0 | 4212 | 0.3252 | 0.7040 |
| 0.3274 | 28.0 | 4368 | 0.3265 | 0.7184 |
| 0.3214 | 29.0 | 4524 | 0.3284 | 0.7365 |
| 0.3214 | 30.0 | 4680 | 0.3290 | 0.7437 |
| 0.3214 | 31.0 | 4836 | 0.3328 | 0.7256 |
| 0.3214 | 32.0 | 4992 | 0.3268 | 0.7220 |
| 0.3167 | 33.0 | 5148 | 0.3372 | 0.7220 |
| 0.3167 | 34.0 | 5304 | 0.3263 | 0.7256 |
| 0.3167 | 35.0 | 5460 | 0.3231 | 0.7365 |
| 0.312 | 36.0 | 5616 | 0.3255 | 0.7256 |
| 0.312 | 37.0 | 5772 | 0.3325 | 0.7148 |
| 0.312 | 38.0 | 5928 | 0.3351 | 0.7365 |
| 0.3083 | 39.0 | 6084 | 0.3362 | 0.7148 |
| 0.3083 | 40.0 | 6240 | 0.3326 | 0.7292 |
| 0.3083 | 41.0 | 6396 | 0.3366 | 0.7220 |
| 0.3081 | 42.0 | 6552 | 0.3265 | 0.7292 |
| 0.3081 | 43.0 | 6708 | 0.3351 | 0.7365 |
| 0.3081 | 44.0 | 6864 | 0.3384 | 0.7329 |
| 0.3032 | 45.0 | 7020 | 0.3298 | 0.7220 |
| 0.3032 | 46.0 | 7176 | 0.3309 | 0.7329 |
| 0.3032 | 47.0 | 7332 | 0.3319 | 0.7256 |
| 0.3032 | 48.0 | 7488 | 0.3452 | 0.7401 |
| 0.2998 | 49.0 | 7644 | 0.3365 | 0.7365 |
| 0.2998 | 50.0 | 7800 | 0.3290 | 0.7256 |
| 0.2998 | 51.0 | 7956 | 0.3251 | 0.7509 |
| 0.2989 | 52.0 | 8112 | 0.3254 | 0.7401 |
| 0.2989 | 53.0 | 8268 | 0.3372 | 0.7365 |
| 0.2989 | 54.0 | 8424 | 0.3401 | 0.7437 |
| 0.2951 | 55.0 | 8580 | 0.3315 | 0.7365 |
| 0.2951 | 56.0 | 8736 | 0.3345 | 0.7292 |
| 0.2951 | 57.0 | 8892 | 0.3301 | 0.7292 |
| 0.2945 | 58.0 | 9048 | 0.3322 | 0.7292 |
| 0.2945 | 59.0 | 9204 | 0.3305 | 0.7329 |
| 0.2945 | 60.0 | 9360 | 0.3316 | 0.7365 |
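
The "No log" entries simply predate the first logged training loss (the Trainer logs training loss every 500 steps by default). To re-run the evaluation, the validation split can be loaded with the `datasets` library; the "rte" config below is the same inference as discussed above, not something the card states:

```python
from datasets import load_dataset

# "rte" is inferred from the step counts above; the card only names super_glue.
dataset = load_dataset("super_glue", "rte")
print(dataset["validation"].num_rows)
```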

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1
  • Datasets 2.14.4
  • Tokenizers 0.13.3