20230822011246

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. After the final epoch (epoch 60, step 18720), it achieves the following results on the evaluation set:

Loss: 12.0925
Accuracy: 0.4729

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.01
train_batch_size: 8
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60.0
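
To make the linear schedule concrete, the sketch below computes the learning rate at a few points in training. It takes the step counts from the training results table (312 steps per epoch, 18720 steps in total). It assumes no warmup and a decay to zero at the last step, since the card lists no warmup setting. The function name linear_lr and its parameters are illustrative, not part of any library.

```python
# Minimal sketch of a linear learning-rate schedule.
# Assumption: zero warmup steps and decay to 0.0 at the final step;
# the model card does not state a warmup setting.

BASE_LR = 0.01          # learning_rate from the hyperparameter list
STEPS_PER_EPOCH = 312   # step count after epoch 1 in the results table
NUM_EPOCHS = 60         # num_epochs from the hyperparameter list
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Return the learning rate after `step` optimizer updates."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the rate is halved by epoch 30 and reaches zero at step 18720.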

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	312	23.5855	0.5271
27.3295	2.0	624	15.7672	0.4729
27.3295	3.0	936	14.1816	0.5271
19.6736	4.0	1248	13.5811	0.4729
18.8481	5.0	1560	13.3851	0.4729
18.8481	6.0	1872	13.0199	0.4729
18.5899	7.0	2184	12.9497	0.4838
18.5899	8.0	2496	12.9961	0.4729
18.473	9.0	2808	12.8275	0.4729
18.3073	10.0	3120	12.6992	0.4729
18.3073	11.0	3432	13.5160	0.5271
18.2739	12.0	3744	12.6731	0.5307
18.1236	13.0	4056	12.6066	0.4729
18.1236	14.0	4368	12.5802	0.4729
18.1096	15.0	4680	12.6447	0.5271
18.1096	16.0	4992	13.3094	0.4729
18.1134	17.0	5304	13.0970	0.5271
18.1098	18.0	5616	12.7293	0.5271
18.1098	19.0	5928	12.6166	0.5271
18.0277	20.0	6240	12.5606	0.4729
18.0289	21.0	6552	12.5322	0.4729
18.0289	22.0	6864	12.7341	0.5271
18.0223	23.0	7176	12.5497	0.4729
18.0223	24.0	7488	12.4199	0.5271
17.9317	25.0	7800	12.7868	0.5271
17.9107	26.0	8112	12.3295	0.4729
17.9107	27.0	8424	12.6038	0.4729
17.8944	28.0	8736	12.3329	0.5271
17.8667	29.0	9048	12.3034	0.5271
17.8667	30.0	9360	12.4605	0.5271
17.8228	31.0	9672	12.5110	0.4729
17.8228	32.0	9984	12.4227	0.5271
17.8006	33.0	10296	12.2972	0.4729
17.76	34.0	10608	12.3011	0.4729
17.76	35.0	10920	12.2179	0.4729
17.7564	36.0	11232	12.2381	0.4729
17.7084	37.0	11544	12.8747	0.4729
17.7084	38.0	11856	12.1945	0.4729
17.7035	39.0	12168	12.2180	0.4729
17.7035	40.0	12480	12.2830	0.4729
17.6668	41.0	12792	12.1857	0.4693
17.6396	42.0	13104	12.2239	0.5379
17.6396	43.0	13416	12.1584	0.5271
17.6452	44.0	13728	12.3185	0.4729
17.6074	45.0	14040	12.2421	0.5271
17.6074	46.0	14352	12.1912	0.4729
17.6167	47.0	14664	12.2022	0.5271
17.6167	48.0	14976	12.1326	0.4729
17.5782	49.0	15288	12.1550	0.4729
17.562	50.0	15600	12.2250	0.5271
17.562	51.0	15912	12.1190	0.4729
17.5409	52.0	16224	12.1505	0.5271
17.5211	53.0	16536	12.1046	0.4729
17.5211	54.0	16848	12.1132	0.5271
17.5043	55.0	17160	12.1159	0.4729
17.5043	56.0	17472	12.1085	0.5271
17.4952	57.0	17784	12.1024	0.4729
17.4731	58.0	18096	12.0955	0.4729
17.4731	59.0	18408	12.0981	0.5271
17.4654	60.0	18720	12.0925	0.4729
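
The table can be scanned programmatically for the best checkpoints. The sketch below parses a handful of rows copied from the table above and reports the rows with the highest accuracy and the lowest validation loss. The names EvalRow and parse_row are hypothetical helpers, and only a sample of rows is included to keep the example short.

```python
from typing import NamedTuple, Optional


class EvalRow(NamedTuple):
    train_loss: Optional[float]  # None where the table shows "No log"
    epoch: float
    step: int
    val_loss: float
    accuracy: float


def parse_row(line: str) -> EvalRow:
    """Parse one whitespace-separated row of the training results table."""
    # Split from the right so that "No log" stays together as one field.
    train_loss, epoch, step, val_loss, accuracy = line.rsplit(None, 4)
    return EvalRow(
        None if train_loss == "No log" else float(train_loss),
        float(epoch),
        int(step),
        float(val_loss),
        float(accuracy),
    )


# Sample rows copied from the table (not the full 60 epochs).
SAMPLE = """
No log 1.0 312 23.5855 0.5271
18.5899 7.0 2184 12.9497 0.4838
18.2739 12.0 3744 12.6731 0.5307
17.6668 41.0 12792 12.1857 0.4693
17.6396 42.0 13104 12.2239 0.5379
17.4654 60.0 18720 12.0925 0.4729
"""

rows = [parse_row(line) for line in SAMPLE.strip().splitlines()]
best_acc = max(rows, key=lambda r: r.accuracy)
best_loss = min(rows, key=lambda r: r.val_loss)

if __name__ == "__main__":
    print("highest accuracy:", best_acc)
    print("lowest validation loss:", best_loss)
```

On these rows, the highest accuracy (0.5379) is at epoch 42, while the lowest validation loss (12.0925) is at epoch 60, so the final checkpoint is not the most accurate one.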

Framework versions

Transformers 4.30.0
Pytorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3

Repository: Onutoa/20230822011246