20230819211604

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set:

Loss: 0.3362
Accuracy: 0.7473

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.004
train_batch_size: 8
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60.0
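
As a rough illustration of what `lr_scheduler_type: linear` implies for this run, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps, since the card lists no warmup setting, and takes the total of 18720 steps (312 per epoch over 60 epochs) from the training results table. The helper `linear_lr` is a hypothetical name used for illustration, not a library function.

```python
def linear_lr(step, base_lr=0.004, total_steps=18720, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption: the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


steps_per_epoch = 312  # step count after epoch 1 in the results table
num_epochs = 60
total_steps = steps_per_epoch * num_epochs  # 18720, the final step in the table

# Print the scheduled learning rate at a few epoch boundaries.
for epoch in (0, 15, 30, 45, 60):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.4f}")
```

Under these assumptions the rate starts at 0.004, reaches 0.002 at the halfway point (step 9360), and falls to zero at the last step.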

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	312	0.4002	0.5307
0.545	2.0	624	0.4058	0.5379
0.545	3.0	936	0.3972	0.5379
0.4698	4.0	1248	0.4360	0.4729
0.4785	5.0	1560	0.3494	0.5090
0.4785	6.0	1872	0.4100	0.4729
0.4322	7.0	2184	0.5717	0.5307
0.4322	8.0	2496	0.4078	0.5379
0.3946	9.0	2808	0.3304	0.6570
0.36	10.0	3120	0.3318	0.6426
0.36	11.0	3432	0.3275	0.6931
0.3478	12.0	3744	0.3314	0.7148
0.3359	13.0	4056	0.3277	0.7112
0.3359	14.0	4368	0.3307	0.7148
0.3249	15.0	4680	0.3245	0.6968
0.3249	16.0	4992	0.3626	0.6498
0.3253	17.0	5304	0.3567	0.6859
0.3155	18.0	5616	0.3279	0.7112
0.3155	19.0	5928	0.3257	0.7256
0.3145	20.0	6240	0.3337	0.7112
0.3051	21.0	6552	0.3289	0.7365
0.3051	22.0	6864	0.3523	0.6931
0.3015	23.0	7176	0.3459	0.7040
0.3015	24.0	7488	0.3323	0.7076
0.2952	25.0	7800	0.3445	0.7329
0.289	26.0	8112	0.3554	0.7329
0.289	27.0	8424	0.3210	0.7292
0.2876	28.0	8736	0.3204	0.7365
0.2862	29.0	9048	0.3374	0.7509
0.2862	30.0	9360	0.3778	0.7112
0.2814	31.0	9672	0.3352	0.7401
0.2814	32.0	9984	0.3251	0.7256
0.2777	33.0	10296	0.3574	0.7617
0.2698	34.0	10608	0.3330	0.7292
0.2698	35.0	10920	0.3388	0.7220
0.2714	36.0	11232	0.3222	0.7329
0.2695	37.0	11544	0.3482	0.7473
0.2695	38.0	11856	0.3447	0.7437
0.2637	39.0	12168	0.3394	0.7401
0.2637	40.0	12480	0.3264	0.7401
0.2646	41.0	12792	0.3311	0.7401
0.2613	42.0	13104	0.3322	0.7365
0.2613	43.0	13416	0.3411	0.7473
0.2539	44.0	13728	0.3298	0.7581
0.2543	45.0	14040	0.3442	0.7437
0.2543	46.0	14352	0.3399	0.7545
0.2516	47.0	14664	0.3330	0.7473
0.2516	48.0	14976	0.3299	0.7473
0.2509	49.0	15288	0.3407	0.7401
0.2484	50.0	15600	0.3268	0.7581
0.2484	51.0	15912	0.3386	0.7509
0.2491	52.0	16224	0.3323	0.7581
0.2483	53.0	16536	0.3448	0.7473
0.2483	54.0	16848	0.3339	0.7545
0.2452	55.0	17160	0.3343	0.7473
0.2452	56.0	17472	0.3408	0.7509
0.2456	57.0	17784	0.3374	0.7545
0.2429	58.0	18096	0.3360	0.7473
0.2429	59.0	18408	0.3345	0.7545
0.2436	60.0	18720	0.3362	0.7473
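
The headline metrics at the top of this card match the final row of the table (epoch 60), which is not necessarily the best checkpoint. A minimal sketch for checking this, with the accuracy column copied by hand from the table above (the variable names are illustrative):

```python
# Validation accuracy per epoch, copied from the training results table (epochs 1-60).
accuracy = [
    0.5307, 0.5379, 0.5379, 0.4729, 0.5090, 0.4729, 0.5307, 0.5379, 0.6570, 0.6426,
    0.6931, 0.7148, 0.7112, 0.7148, 0.6968, 0.6498, 0.6859, 0.7112, 0.7256, 0.7112,
    0.7365, 0.6931, 0.7040, 0.7076, 0.7329, 0.7329, 0.7292, 0.7365, 0.7509, 0.7112,
    0.7401, 0.7256, 0.7617, 0.7292, 0.7220, 0.7329, 0.7473, 0.7437, 0.7401, 0.7401,
    0.7401, 0.7365, 0.7473, 0.7581, 0.7437, 0.7545, 0.7473, 0.7473, 0.7401, 0.7581,
    0.7509, 0.7581, 0.7473, 0.7545, 0.7473, 0.7509, 0.7545, 0.7473, 0.7545, 0.7473,
]

# Epochs are 1-indexed in the table; max() returns the first epoch on ties.
best_epoch = max(range(len(accuracy)), key=accuracy.__getitem__) + 1
best_acc = accuracy[best_epoch - 1]
final_acc = accuracy[-1]

print(f"best:  epoch {best_epoch}, accuracy {best_acc:.4f}")
print(f"final: epoch {len(accuracy)}, accuracy {final_acc:.4f}")
```

With the values above, the highest validation accuracy is 0.7617 at epoch 33, compared with 0.7473 at epoch 60. The card does not say which checkpoint was saved.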

Framework versions

Transformers 4.30.0
Pytorch 2.0.1
Datasets 2.14.4
Tokenizers 0.13.3

Model repository: Onutoa/20230819211604
