# 20230820161846
This model is a fine-tuned version of [bert-large-cased](https://huggingface.co/bert-large-cased) on the super_glue dataset. It achieves the following results on the evaluation set:
- Loss: 0.3402
- Accuracy: 0.7401
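
The card does not include a usage snippet, so here is a minimal inference sketch using the standard `transformers` auto classes. The repo id is a placeholder, and the sentence-pair input format is an assumption based on the entailment-style SuperGLUE tasks (the card does not name the subset):

```python
# Minimal inference sketch; not from the card. The repo id below is a
# placeholder for wherever this checkpoint is actually published, and the
# sentence-pair input assumes an entailment-style SuperGLUE task.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "your-namespace/20230820161846"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

premise = "The cat sat on the mat."
hypothesis = "A cat is on the mat."
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)

with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(probs)  # class probabilities; label meanings depend on the task config
```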
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `Trainer` sketch reproducing them follows the list):
- learning_rate: 0.005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 60.0
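
For context, here is a minimal `Trainer` setup consistent with these values. It is a reconstruction, not the author's script, and the `"rte"` subset is an assumption: the card only says super_glue, but 312 optimizer steps per epoch at batch size 8 imply roughly 2,490 training examples, which matches RTE.

```python
# Reconstruction sketch of the listed hyperparameters, not the actual
# training script. The "rte" config is an assumption consistent with the
# 312 steps/epoch reported in the results table.
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

raw = load_dataset("super_glue", "rte")  # assumed subset
tokenizer = AutoTokenizer.from_pretrained("bert-large-cased")

def tokenize(batch):
    return tokenizer(batch["premise"], batch["hypothesis"], truncation=True)

ds = raw.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained("bert-large-cased")

args = TrainingArguments(
    output_dir="out",
    learning_rate=5e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",   # Adam betas/epsilon are the library defaults listed above
    num_train_epochs=60,
    evaluation_strategy="epoch",  # the results table reports one eval per epoch
)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=ds["train"],
    eval_dataset=ds["validation"],
    tokenizer=tokenizer,          # enables dynamic padding in the default collator
    compute_metrics=compute_metrics,
)
trainer.train()
```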
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
No log | 1.0 | 312 | 0.3514 | 0.5560 |
0.6229 | 2.0 | 624 | 0.6273 | 0.5487 |
0.6229 | 3.0 | 936 | 0.8085 | 0.4729 |
0.5188 | 4.0 | 1248 | 0.6060 | 0.4729 |
0.4693 | 5.0 | 1560 | 0.3607 | 0.4946 |
0.4693 | 6.0 | 1872 | 0.3897 | 0.4801 |
0.4249 | 7.0 | 2184 | 0.5828 | 0.5271 |
0.4249 | 8.0 | 2496 | 0.4718 | 0.5307 |
0.4226 | 9.0 | 2808 | 0.5343 | 0.4729 |
0.42 | 10.0 | 3120 | 0.3478 | 0.5451 |
0.42 | 11.0 | 3432 | 0.4042 | 0.5271 |
0.407 | 12.0 | 3744 | 0.5783 | 0.4693 |
0.4156 | 13.0 | 4056 | 0.3466 | 0.5740 |
0.4156 | 14.0 | 4368 | 0.3720 | 0.5379 |
0.3784 | 15.0 | 4680 | 0.3414 | 0.6318 |
0.3784 | 16.0 | 4992 | 0.3330 | 0.6318 |
0.3734 | 17.0 | 5304 | 0.4631 | 0.5957 |
0.3573 | 18.0 | 5616 | 0.3375 | 0.5848 |
0.3573 | 19.0 | 5928 | 0.3429 | 0.6606 |
0.3516 | 20.0 | 6240 | 0.3344 | 0.6606 |
0.3399 | 21.0 | 6552 | 0.3671 | 0.6679 |
0.3399 | 22.0 | 6864 | 0.3485 | 0.6643 |
0.3345 | 23.0 | 7176 | 0.3416 | 0.6679 |
0.3345 | 24.0 | 7488 | 0.3263 | 0.6968 |
0.325 | 25.0 | 7800 | 0.3331 | 0.6895 |
0.3197 | 26.0 | 8112 | 0.3591 | 0.6787 |
0.3197 | 27.0 | 8424 | 0.3175 | 0.7292 |
0.3165 | 28.0 | 8736 | 0.3208 | 0.7148 |
0.3122 | 29.0 | 9048 | 0.3200 | 0.7292 |
0.3122 | 30.0 | 9360 | 0.3790 | 0.6570 |
0.3072 | 31.0 | 9672 | 0.3221 | 0.7112 |
0.3072 | 32.0 | 9984 | 0.3263 | 0.7365 |
0.3041 | 33.0 | 10296 | 0.3322 | 0.7292 |
0.2885 | 34.0 | 10608 | 0.3296 | 0.7365 |
0.2885 | 35.0 | 10920 | 0.3265 | 0.7220 |
0.2875 | 36.0 | 11232 | 0.3236 | 0.7509 |
0.2848 | 37.0 | 11544 | 0.3484 | 0.7112 |
0.2848 | 38.0 | 11856 | 0.3266 | 0.7365 |
0.2766 | 39.0 | 12168 | 0.3304 | 0.7473 |
0.2766 | 40.0 | 12480 | 0.3305 | 0.7401 |
0.2743 | 41.0 | 12792 | 0.3287 | 0.7545 |
0.2708 | 42.0 | 13104 | 0.3292 | 0.7365 |
0.2708 | 43.0 | 13416 | 0.3363 | 0.7256 |
0.2662 | 44.0 | 13728 | 0.3203 | 0.7329 |
0.2636 | 45.0 | 14040 | 0.3338 | 0.7401 |
0.2636 | 46.0 | 14352 | 0.3480 | 0.7365 |
0.261 | 47.0 | 14664 | 0.3282 | 0.7401 |
0.261 | 48.0 | 14976 | 0.3330 | 0.7329 |
0.2585 | 49.0 | 15288 | 0.3519 | 0.7292 |
0.2561 | 50.0 | 15600 | 0.3215 | 0.7473 |
0.2561 | 51.0 | 15912 | 0.3388 | 0.7401 |
0.2569 | 52.0 | 16224 | 0.3327 | 0.7365 |
0.2544 | 53.0 | 16536 | 0.3402 | 0.7401 |
0.2544 | 54.0 | 16848 | 0.3313 | 0.7437 |
0.2499 | 55.0 | 17160 | 0.3317 | 0.7401 |
0.2499 | 56.0 | 17472 | 0.3465 | 0.7329 |
0.2505 | 57.0 | 17784 | 0.3398 | 0.7437 |
0.2468 | 58.0 | 18096 | 0.3380 | 0.7437 |
0.2468 | 59.0 | 18408 | 0.3370 | 0.7437 |
0.2487 | 60.0 | 18720 | 0.3402 | 0.7401 |
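
Validation accuracy peaks at 0.7545 at epoch 41 and drifts between roughly 0.72 and 0.75 thereafter; the headline loss (0.3402) and accuracy (0.7401) above correspond to the final epoch-60 row, not the best checkpoint.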
### Framework versions
- Transformers 4.30.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3