20230816190102

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3431
  • Accuracy: 0.7004
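
A minimal inference sketch, assuming the checkpoint carries a sequence-classification head (the card does not state which SuperGLUE task it was fine-tuned on, so the sentence-pair input below is only illustrative):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/20230816190102"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Sentence-pair input, as used by most SuperGLUE classification tasks.
inputs = tokenizer(
    "The cat sat on the mat.",
    "A cat is on a mat.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class id
```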

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
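
This configuration can be approximated with Transformers' TrainingArguments; a sketch, where output_dir is an assumption the card does not record:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="output",             # assumed; not stated in the card
    learning_rate=0.005,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
)
```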

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.5884          | 0.5235   |
| 0.6001        | 2.0   | 624   | 0.4145          | 0.4729   |
| 0.6001        | 3.0   | 936   | 0.6337          | 0.4729   |
| 0.5343        | 4.0   | 1248  | 0.3934          | 0.4838   |
| 0.5255        | 5.0   | 1560  | 0.5662          | 0.4729   |
| 0.5255        | 6.0   | 1872  | 0.5158          | 0.5271   |
| 0.504         | 7.0   | 2184  | 0.3480          | 0.5343   |
| 0.504         | 8.0   | 2496  | 0.3846          | 0.5379   |
| 0.4941        | 9.0   | 2808  | 0.5111          | 0.5307   |
| 0.5022        | 10.0  | 3120  | 0.4621          | 0.5271   |
| 0.5022        | 11.0  | 3432  | 0.3418          | 0.6426   |
| 0.453         | 12.0  | 3744  | 0.3652          | 0.5632   |
| 0.3879        | 13.0  | 4056  | 0.3451          | 0.5596   |
| 0.3879        | 14.0  | 4368  | 0.3312          | 0.6426   |
| 0.3698        | 15.0  | 4680  | 0.3599          | 0.6462   |
| 0.3698        | 16.0  | 4992  | 0.3947          | 0.5993   |
| 0.3705        | 17.0  | 5304  | 0.3833          | 0.6173   |
| 0.3598        | 18.0  | 5616  | 0.3354          | 0.6462   |
| 0.3598        | 19.0  | 5928  | 0.3395          | 0.6715   |
| 0.3631        | 20.0  | 6240  | 0.3664          | 0.6390   |
| 0.3515        | 21.0  | 6552  | 0.3420          | 0.6787   |
| 0.3515        | 22.0  | 6864  | 0.3483          | 0.6137   |
| 0.3486        | 23.0  | 7176  | 0.3820          | 0.6498   |
| 0.3486        | 24.0  | 7488  | 0.3240          | 0.7004   |
| 0.3437        | 25.0  | 7800  | 0.3300          | 0.7148   |
| 0.3389        | 26.0  | 8112  | 0.3405          | 0.6787   |
| 0.3389        | 27.0  | 8424  | 0.3291          | 0.6968   |
| 0.3363        | 28.0  | 8736  | 0.3338          | 0.6895   |
| 0.3381        | 29.0  | 9048  | 0.3366          | 0.7220   |
| 0.3381        | 30.0  | 9360  | 0.3831          | 0.6606   |
| 0.3302        | 31.0  | 9672  | 0.3300          | 0.7040   |
| 0.3302        | 32.0  | 9984  | 0.3224          | 0.7040   |
| 0.33          | 33.0  | 10296 | 0.3332          | 0.6787   |
| 0.3271        | 34.0  | 10608 | 0.3412          | 0.7256   |
| 0.3271        | 35.0  | 10920 | 0.3197          | 0.7076   |
| 0.3266        | 36.0  | 11232 | 0.3236          | 0.7148   |
| 0.3248        | 37.0  | 11544 | 0.3621          | 0.6751   |
| 0.3248        | 38.0  | 11856 | 0.3330          | 0.7040   |
| 0.3223        | 39.0  | 12168 | 0.3636          | 0.6823   |
| 0.3223        | 40.0  | 12480 | 0.3298          | 0.7076   |
| 0.3205        | 41.0  | 12792 | 0.3224          | 0.7148   |
| 0.3177        | 42.0  | 13104 | 0.3288          | 0.7256   |
| 0.3177        | 43.0  | 13416 | 0.3464          | 0.6823   |
| 0.3167        | 44.0  | 13728 | 0.3567          | 0.6787   |
| 0.3159        | 45.0  | 14040 | 0.3551          | 0.6895   |
| 0.3159        | 46.0  | 14352 | 0.3313          | 0.7112   |
| 0.3131        | 47.0  | 14664 | 0.3233          | 0.7292   |
| 0.3131        | 48.0  | 14976 | 0.3508          | 0.6751   |
| 0.3118        | 49.0  | 15288 | 0.3420          | 0.7040   |
| 0.3088        | 50.0  | 15600 | 0.3410          | 0.6968   |
| 0.3088        | 51.0  | 15912 | 0.3421          | 0.7040   |
| 0.3082        | 52.0  | 16224 | 0.3411          | 0.7040   |
| 0.3068        | 53.0  | 16536 | 0.3616          | 0.6823   |
| 0.3068        | 54.0  | 16848 | 0.3555          | 0.6715   |
| 0.3031        | 55.0  | 17160 | 0.3418          | 0.7004   |
| 0.3031        | 56.0  | 17472 | 0.3460          | 0.6859   |
| 0.3039        | 57.0  | 17784 | 0.3353          | 0.7148   |
| 0.3025        | 58.0  | 18096 | 0.3450          | 0.7004   |
| 0.3025        | 59.0  | 18408 | 0.3427          | 0.7040   |
| 0.3034        | 60.0  | 18720 | 0.3431          | 0.7004   |

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1
  • Datasets 2.14.4
  • Tokenizers 0.13.3
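
To reproduce this environment, the reported versions can be pinned at install time (a sketch; note that PyTorch ships on PyPI as torch):

```bash
pip install transformers==4.30.0 torch==2.0.1 datasets==2.14.4 tokenizers==0.13.3
```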