20230817123430

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set; both values match the final epoch (60) in the training results table:

Loss: 0.3373
Accuracy: 0.7437
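
A minimal loading sketch, assuming the checkpoint is published on the Hugging Face Hub as Onutoa/20230817123430 and carries a sequence-classification head. The card does not say which SuperGLUE task was used, so the sentence-pair input format and the helper name classify_pair are assumptions for illustration, and the label returned depends on the checkpoint's own config.

```python
MODEL_ID = "Onutoa/20230817123430"  # assumed Hub repository id for this card


def classify_pair(first, second, model_id=MODEL_ID):
    """Hypothetical helper: classify a pair of texts with the fine-tuned model.

    Assumes a sentence-pair classification task; the card does not confirm this.
    """
    # Local imports so this module can be imported without the heavy libraries.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()  # disable dropout for inference

    # Encode both texts as one [CLS] first [SEP] second [SEP] sequence.
    inputs = tokenizer(first, second, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits

    predicted = int(logits.argmax(dim=-1).item())
    # Fall back to the raw class index if the config has no label name for it.
    return model.config.id2label.get(predicted, predicted)
```

Calling classify_pair("text one", "text two") downloads the weights on first use. Without label names in the config, the result is a generic name such as LABEL_0 or LABEL_1.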

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.003
train_batch_size: 8
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60.0
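
The list above can be written out as keyword arguments for the Transformers TrainingArguments class, together with the learning rate a linear schedule would produce at each step. This is a sketch under assumptions the card does not state: the batch sizes are per device, there is no warmup, and the total of 18720 steps is taken from the last row of the training results table. The names HYPERPARAMETERS, linear_lr, and build_training_args are illustrative only.

```python
# Values copied from the hyperparameter list; keys are TrainingArguments names.
HYPERPARAMETERS = {
    "learning_rate": 0.003,
    "per_device_train_batch_size": 8,  # reported as train_batch_size
    "per_device_eval_batch_size": 8,   # reported as eval_batch_size
    "seed": 11,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 60.0,
}

TOTAL_STEPS = 18720  # last step in the training results table
WARMUP_STEPS = 0     # assumption: the card lists no warmup setting


def linear_lr(step, base_lr=HYPERPARAMETERS["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at a given step under linear warmup then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def build_training_args(output_dir="out"):
    """Build TrainingArguments from the values above (needs transformers installed)."""
    from transformers import TrainingArguments
    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)


if __name__ == "__main__":
    for step in (0, 9360, 18720):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.003, is halved by step 9360 (the end of epoch 30), and reaches zero at the final step.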

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	312	0.3735	0.5054
0.5142	2.0	624	0.4848	0.5415
0.5142	3.0	936	0.3802	0.5379
0.4859	4.0	1248	0.4823	0.4729
0.4412	5.0	1560	0.3902	0.5379
0.4412	6.0	1872	0.3744	0.5596
0.4418	7.0	2184	0.4612	0.5487
0.4418	8.0	2496	0.4590	0.4729
0.4467	9.0	2808	0.4777	0.4729
0.4177	10.0	3120	0.3616	0.4838
0.4177	11.0	3432	0.3736	0.6245
0.3988	12.0	3744	0.3464	0.5993
0.3911	13.0	4056	0.3522	0.6282
0.3911	14.0	4368	0.3406	0.6859
0.3893	15.0	4680	0.4223	0.6570
0.3893	16.0	4992	0.6759	0.5415
0.38	17.0	5304	0.3631	0.6823
0.3772	18.0	5616	0.3434	0.6931
0.3772	19.0	5928	0.3344	0.6137
0.3639	20.0	6240	0.3670	0.6968
0.336	21.0	6552	0.3483	0.6895
0.336	22.0	6864	0.3485	0.7148
0.3369	23.0	7176	0.3541	0.7184
0.3369	24.0	7488	0.3346	0.7112
0.3291	25.0	7800	0.3387	0.7365
0.3228	26.0	8112	0.3492	0.7220
0.3228	27.0	8424	0.3334	0.7040
0.3206	28.0	8736	0.3388	0.7401
0.3189	29.0	9048	0.3304	0.7365
0.3189	30.0	9360	0.3566	0.7292
0.3148	31.0	9672	0.3370	0.7329
0.3148	32.0	9984	0.3328	0.7292
0.31	33.0	10296	0.3422	0.7437
0.306	34.0	10608	0.3339	0.7292
0.306	35.0	10920	0.3254	0.7292
0.3032	36.0	11232	0.3330	0.7473
0.3028	37.0	11544	0.3718	0.7184
0.3028	38.0	11856	0.3294	0.7473
0.3005	39.0	12168	0.3465	0.7329
0.3005	40.0	12480	0.3334	0.7292
0.2965	41.0	12792	0.3239	0.7256
0.2947	42.0	13104	0.3322	0.7329
0.2947	43.0	13416	0.3370	0.7401
0.2909	44.0	13728	0.3385	0.7473
0.2915	45.0	14040	0.3365	0.7329
0.2915	46.0	14352	0.3435	0.7365
0.29	47.0	14664	0.3301	0.7437
0.29	48.0	14976	0.3443	0.7401
0.2872	49.0	15288	0.3393	0.7437
0.2838	50.0	15600	0.3291	0.7437
0.2838	51.0	15912	0.3356	0.7401
0.2865	52.0	16224	0.3307	0.7365
0.2823	53.0	16536	0.3413	0.7401
0.2823	54.0	16848	0.3353	0.7437
0.28	55.0	17160	0.3315	0.7365
0.28	56.0	17472	0.3433	0.7365
0.2832	57.0	17784	0.3338	0.7401
0.2794	58.0	18096	0.3367	0.7401
0.2794	59.0	18408	0.3371	0.7401
0.2785	60.0	18720	0.3373	0.7437
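
The table can be summarized with a short script. The sketch below copies the accuracy column and reports the best epochs alongside the final one. The step arithmetic assumes a single device, no gradient accumulation, and that the last partial batch is kept, none of which the card states.

```python
# Validation accuracy per epoch (epochs 1..60), copied from the table above.
ACCURACY = [
    0.5054, 0.5415, 0.5379, 0.4729, 0.5379, 0.5596, 0.5487, 0.4729, 0.4729, 0.4838,
    0.6245, 0.5993, 0.6282, 0.6859, 0.6570, 0.5415, 0.6823, 0.6931, 0.6137, 0.6968,
    0.6895, 0.7148, 0.7184, 0.7112, 0.7365, 0.7220, 0.7040, 0.7401, 0.7365, 0.7292,
    0.7329, 0.7292, 0.7437, 0.7292, 0.7292, 0.7473, 0.7184, 0.7473, 0.7329, 0.7292,
    0.7256, 0.7329, 0.7401, 0.7473, 0.7329, 0.7365, 0.7437, 0.7401, 0.7437, 0.7437,
    0.7401, 0.7365, 0.7401, 0.7437, 0.7365, 0.7365, 0.7401, 0.7401, 0.7401, 0.7437,
]

STEPS_PER_EPOCH = 312  # step count after epoch 1
TRAIN_BATCH_SIZE = 8   # from the hyperparameter list

best_accuracy = max(ACCURACY)
# Epochs are 1-indexed in the table.
best_epochs = [i + 1 for i, acc in enumerate(ACCURACY) if acc == best_accuracy]
final_accuracy = ACCURACY[-1]

total_steps = STEPS_PER_EPOCH * len(ACCURACY)
# With a kept partial last batch, 312 steps cover between 311*8+1 and 312*8 examples.
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

print(f"best accuracy {best_accuracy} at epochs {best_epochs}")  # 0.7473 at [36, 38, 44]
print(f"final accuracy {final_accuracy}")                        # 0.7437
print(f"total steps {total_steps}")                              # 18720
print(f"training examples per epoch: {min_examples} to {max_examples}")  # 2489 to 2496
```

The reported accuracy of 0.7437 is the final-epoch value. Three earlier epochs reach a slightly higher 0.7473.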

Framework versions

Transformers 4.30.0
PyTorch 2.0.1
Datasets 2.14.4
Tokenizers 0.13.3
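
To compare a local environment against these versions, one option is a small check based on the standard-library importlib.metadata module. This is a sketch: the helper name check_versions is illustrative, and it assumes the packages are installed under the distribution names transformers, torch, datasets, and tokenizers.

```python
from importlib import metadata

# Versions listed on the card, keyed by (assumed) pip distribution name.
EXPECTED = {
    "transformers": "4.30.0",
    "torch": "2.0.1",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def check_versions(expected=EXPECTED):
    """Return {package: (expected, installed_or_None, matches)} for each package."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build suffixes such as "+cu117" when comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[package] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, ok) in check_versions().items():
        status = "ok" if ok else "differs"
        print(f"{package}: expected {wanted}, installed {installed} ({status})")
```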

Model repository: Onutoa/20230817123430
