20230821154607

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set at the end of training (epoch 60):

Loss: 0.3385
Accuracy: 0.7437

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.004
train_batch_size: 8
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60.0
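
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists no warmup setting) and takes the 312 steps per epoch from the results table; `linear_lr` and the constant names are hypothetical helpers, not part of the training script.

```python
BASE_LR = 0.004          # learning_rate from the card
STEPS_PER_EPOCH = 312    # step count logged at epoch 1.0 in the results table
NUM_EPOCHS = 60          # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 18720, the final logged step


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not state a warmup value.
    """
    if step < warmup_steps:
        return BASE_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    # Print the scheduled learning rate at a few epoch boundaries.
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the rate falls to half its initial value at step 9360 (epoch 30) and reaches zero at step 18720.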

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	312	0.3899	0.5271
0.5615	2.0	624	0.3545	0.5596
0.5615	3.0	936	0.5571	0.4729
0.4381	4.0	1248	0.3457	0.5379
0.4338	5.0	1560	0.3504	0.5704
0.4338	6.0	1872	0.4047	0.5596
0.4327	7.0	2184	0.3446	0.6065
0.4327	8.0	2496	0.3317	0.6859
0.3696	9.0	2808	0.3344	0.6751
0.3503	10.0	3120	0.3280	0.7292
0.3503	11.0	3432	0.3260	0.6895
0.3459	12.0	3744	0.3253	0.7040
0.3338	13.0	4056	0.3294	0.6895
0.3338	14.0	4368	0.3428	0.6895
0.3271	15.0	4680	0.3216	0.6931
0.3271	16.0	4992	0.3505	0.6787
0.322	17.0	5304	0.3411	0.7148
0.3152	18.0	5616	0.3221	0.7004
0.3152	19.0	5928	0.3259	0.7292
0.3141	20.0	6240	0.3706	0.6570
0.3026	21.0	6552	0.3651	0.6895
0.3026	22.0	6864	0.3609	0.6895
0.3009	23.0	7176	0.3537	0.7076
0.3009	24.0	7488	0.3329	0.7401
0.2977	25.0	7800	0.3269	0.7329
0.2913	26.0	8112	0.3431	0.7292
0.2913	27.0	8424	0.3236	0.7256
0.2898	28.0	8736	0.3209	0.7184
0.2862	29.0	9048	0.3299	0.7329
0.2862	30.0	9360	0.3527	0.7329
0.2812	31.0	9672	0.3402	0.7256
0.2812	32.0	9984	0.3236	0.7437
0.2793	33.0	10296	0.3509	0.7581
0.2692	34.0	10608	0.3250	0.7509
0.2692	35.0	10920	0.3340	0.7473
0.2696	36.0	11232	0.3267	0.7401
0.2668	37.0	11544	0.3485	0.7437
0.2668	38.0	11856	0.3355	0.7509
0.2641	39.0	12168	0.3305	0.7473
0.2641	40.0	12480	0.3309	0.7437
0.2616	41.0	12792	0.3252	0.7509
0.2612	42.0	13104	0.3285	0.7545
0.2612	43.0	13416	0.3412	0.7545
0.2569	44.0	13728	0.3383	0.7437
0.2559	45.0	14040	0.3340	0.7437
0.2559	46.0	14352	0.3475	0.7401
0.2532	47.0	14664	0.3325	0.7401
0.2532	48.0	14976	0.3355	0.7473
0.2508	49.0	15288	0.3478	0.7401
0.2475	50.0	15600	0.3290	0.7365
0.2475	51.0	15912	0.3432	0.7401
0.2488	52.0	16224	0.3493	0.7329
0.2462	53.0	16536	0.3472	0.7437
0.2462	54.0	16848	0.3351	0.7401
0.2456	55.0	17160	0.3470	0.7401
0.2456	56.0	17472	0.3390	0.7401
0.2455	57.0	17784	0.3416	0.7401
0.2433	58.0	18096	0.3366	0.7437
0.2433	59.0	18408	0.3382	0.7437
0.2431	60.0	18720	0.3385	0.7437
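
The summary metrics at the top of this card match the final row (epoch 60), while some earlier epochs in the table score higher on accuracy or lower on validation loss. The sketch below shows one way to pick those epochs out. It uses a hand-copied subset of rows from the table, and `EvalRow` is a hypothetical helper type. Whether checkpoints from those epochs were saved is not stated in the card.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    accuracy: float


# Subset of rows copied from the training results table above.
ROWS = [
    EvalRow(1, 312, 0.3899, 0.5271),
    EvalRow(15, 4680, 0.3216, 0.6931),
    EvalRow(28, 8736, 0.3209, 0.7184),
    EvalRow(33, 10296, 0.3509, 0.7581),
    EvalRow(42, 13104, 0.3285, 0.7545),
    EvalRow(60, 18720, 0.3385, 0.7437),
]

best_acc = max(ROWS, key=lambda r: r.accuracy)   # highest validation accuracy
best_loss = min(ROWS, key=lambda r: r.val_loss)  # lowest validation loss
final = ROWS[-1]                                 # last epoch, as reported in the summary

if __name__ == "__main__":
    print(f"best accuracy:   epoch {best_acc.epoch} ({best_acc.accuracy:.4f})")
    print(f"lowest val loss: epoch {best_loss.epoch} ({best_loss.val_loss:.4f})")
    print(f"final epoch:     epoch {final.epoch} ({final.accuracy:.4f})")
```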

Framework versions

Transformers 4.30.0
PyTorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3

Model repository: Onutoa/20230821154607
