20230817153600

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. At the end of training (epoch 60, step 18720), it achieves the following results on the evaluation set:

Loss: 0.3379
Accuracy: 0.7726

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.005
train_batch_size: 8
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60.0
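
To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes no warmup (the list above names none) and 18720 total optimizer steps (the final step in the training results). The function name `linear_lr` and the constants are hypothetical and are not part of any library.

```python
BASE_LR = 0.005       # learning_rate from the list above
STEPS_PER_EPOCH = 312 # step count after epoch 1 in the training results
TOTAL_STEPS = 18720   # 60 epochs x 312 steps per epoch
WARMUP_STEPS = 0      # assumption: the card lists no warmup setting


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR (unused when WARMUP_STEPS == 0).
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.005, is halved by the midpoint of training (step 9360), and reaches zero at the final step.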

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	312	0.5320	0.5199
0.6084	2.0	624	0.5060	0.5307
0.6084	3.0	936	0.4765	0.4729
0.4786	4.0	1248	0.3862	0.4729
0.5253	5.0	1560	0.5091	0.5343
0.5253	6.0	1872	0.3768	0.4982
0.5144	7.0	2184	0.4406	0.5271
0.5144	8.0	2496	0.3461	0.6318
0.4407	9.0	2808	0.3480	0.6534
0.4002	10.0	3120	0.3629	0.6643
0.4002	11.0	3432	0.3949	0.5560
0.3576	12.0	3744	0.3366	0.7076
0.346	13.0	4056	0.3302	0.7040
0.346	14.0	4368	0.3293	0.7184
0.337	15.0	4680	0.3301	0.7292
0.337	16.0	4992	0.3398	0.7329
0.3323	17.0	5304	0.3555	0.7256
0.3245	18.0	5616	0.3257	0.7040
0.3245	19.0	5928	0.3257	0.7292
0.3243	20.0	6240	0.3507	0.7220
0.3144	21.0	6552	0.4047	0.7184
0.3144	22.0	6864	0.3620	0.7220
0.3135	23.0	7176	0.3740	0.7148
0.3135	24.0	7488	0.3315	0.7437
0.3063	25.0	7800	0.3291	0.7437
0.2986	26.0	8112	0.3626	0.7292
0.2986	27.0	8424	0.3281	0.7401
0.2956	28.0	8736	0.3376	0.7401
0.2927	29.0	9048	0.3310	0.7545
0.2927	30.0	9360	0.3471	0.7437
0.2853	31.0	9672	0.3205	0.7581
0.2853	32.0	9984	0.3271	0.7509
0.2861	33.0	10296	0.3423	0.7509
0.2782	34.0	10608	0.3328	0.7473
0.2782	35.0	10920	0.3289	0.7617
0.2756	36.0	11232	0.3309	0.7581
0.2758	37.0	11544	0.3741	0.7365
0.2758	38.0	11856	0.3326	0.7473
0.2714	39.0	12168	0.3611	0.7184
0.2714	40.0	12480	0.3352	0.7473
0.2687	41.0	12792	0.3405	0.7437
0.2685	42.0	13104	0.3408	0.7365
0.2685	43.0	13416	0.3414	0.7473
0.2649	44.0	13728	0.3369	0.7545
0.2615	45.0	14040	0.3371	0.7545
0.2615	46.0	14352	0.3428	0.7509
0.2602	47.0	14664	0.3286	0.7545
0.2602	48.0	14976	0.3316	0.7581
0.2595	49.0	15288	0.3401	0.7545
0.2551	50.0	15600	0.3362	0.7653
0.2551	51.0	15912	0.3434	0.7653
0.2574	52.0	16224	0.3302	0.7726
0.2515	53.0	16536	0.3464	0.7473
0.2515	54.0	16848	0.3337	0.7690
0.252	55.0	17160	0.3364	0.7690
0.252	56.0	17472	0.3418	0.7509
0.2497	57.0	17784	0.3407	0.7581
0.2503	58.0	18096	0.3419	0.7545
0.2503	59.0	18408	0.3376	0.7762
0.2504	60.0	18720	0.3379	0.7726
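
Because validation accuracy fluctuates from epoch to epoch, the final checkpoint is not necessarily the strongest one. The sketch below picks the best epoch by accuracy and by validation loss from a handful of rows copied from the table above. The `Row` type and variable names are hypothetical, and only a subset of rows is included to keep the example short.

```python
from typing import NamedTuple


class Row(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    accuracy: float


# Subset of rows copied from the table above (not the full log).
rows = [
    Row(31, 9672, 0.3205, 0.7581),
    Row(52, 16224, 0.3302, 0.7726),
    Row(59, 18408, 0.3376, 0.7762),
    Row(60, 18720, 0.3379, 0.7726),
]

# Highest validation accuracy and lowest validation loss among the rows.
best_by_accuracy = max(rows, key=lambda r: r.accuracy)
best_by_loss = min(rows, key=lambda r: r.val_loss)

print(best_by_accuracy.epoch, best_by_accuracy.accuracy)
print(best_by_loss.epoch, best_by_loss.val_loss)
```

In the table, accuracy peaks at epoch 59 (0.7762) while validation loss is lowest at epoch 31 (0.3205), so the preferred checkpoint depends on which metric matters for the use case.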

Framework versions

Transformers 4.30.0
PyTorch 2.0.1
Datasets 2.14.4
Tokenizers 0.13.3
