
20230822120451

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the metrics):

  • Loss: 11.7866
  • Accuracy: 0.4729
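
For reference, here is a minimal sketch of loading this checkpoint for inference. It assumes the checkpoint exposes a standard sequence-classification head over a sentence pair; the specific SuperGLUE task is not stated in this card, so both the input format and number of labels are assumptions.

```python
# A minimal inference sketch. Assumption: the checkpoint carries a
# sequence-classification head and takes a sentence pair; the specific
# SuperGLUE task is not stated in this card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/20230822120451"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Hypothetical sentence pair, purely for illustration.
inputs = tokenizer("The cat sat on the mat.", "A cat is on a mat.",
                   return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```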

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reconstruction sketch follows the list):

  • learning_rate: 0.004
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
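
The list above maps onto Hugging Face `TrainingArguments` roughly as below. This is a hedged reconstruction, not the author's script: `output_dir` is a placeholder, and the per-epoch evaluation schedule is inferred from the results table rather than stated in the card.

```python
# A hedged reconstruction of the training configuration from the list above.
# output_dir is a placeholder; evaluation_strategy="epoch" is inferred from
# the per-epoch rows in the results table, not stated explicitly.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230822120451",   # placeholder
    learning_rate=4e-3,            # 0.004
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",
)
```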

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 13.5886         | 0.5271   |
| 18.7749       | 2.0   | 624   | 13.1889         | 0.4729   |
| 18.7749       | 3.0   | 936   | 12.7687         | 0.4729   |
| 17.8689       | 4.0   | 1248  | 12.3773         | 0.4729   |
| 17.738        | 5.0   | 1560  | 12.5498         | 0.4729   |
| 17.738        | 6.0   | 1872  | 12.3920         | 0.4729   |
| 17.7159       | 7.0   | 2184  | 12.3910         | 0.4729   |
| 17.7159       | 8.0   | 2496  | 12.3585         | 0.4729   |
| 17.6431       | 9.0   | 2808  | 12.3978         | 0.4729   |
| 17.5993       | 10.0  | 3120  | 12.2603         | 0.4729   |
| 17.5993       | 11.0  | 3432  | 12.1054         | 0.4729   |
| 17.5276       | 12.0  | 3744  | 12.1379         | 0.5271   |
| 17.4675       | 13.0  | 4056  | 12.0354         | 0.5271   |
| 17.4675       | 14.0  | 4368  | 12.0828         | 0.5271   |
| 17.4824       | 15.0  | 4680  | 11.9830         | 0.5271   |
| 17.4824       | 16.0  | 4992  | 12.0574         | 0.4729   |
| 17.4065       | 17.0  | 5304  | 12.7325         | 0.5271   |
| 17.4328       | 18.0  | 5616  | 12.0570         | 0.4729   |
| 17.4328       | 19.0  | 5928  | 12.0770         | 0.4729   |
| 17.3925       | 20.0  | 6240  | 12.0314         | 0.5271   |
| 17.3467       | 21.0  | 6552  | 11.9670         | 0.5271   |
| 17.3467       | 22.0  | 6864  | 12.1346         | 0.5271   |
| 17.3575       | 23.0  | 7176  | 12.4856         | 0.4729   |
| 17.3575       | 24.0  | 7488  | 12.8699         | 0.4729   |
| 17.3374       | 25.0  | 7800  | 11.9199         | 0.5307   |
| 17.3162       | 26.0  | 8112  | 11.9558         | 0.5271   |
| 17.3162       | 27.0  | 8424  | 11.9757         | 0.5271   |
| 17.307        | 28.0  | 8736  | 12.2557         | 0.4729   |
| 17.2934       | 29.0  | 9048  | 11.8987         | 0.4729   |
| 17.2934       | 30.0  | 9360  | 12.1451         | 0.5271   |
| 17.2734       | 31.0  | 9672  | 11.9358         | 0.5271   |
| 17.2734       | 32.0  | 9984  | 11.9698         | 0.5271   |
| 17.2631       | 33.0  | 10296 | 11.9269         | 0.4729   |
| 17.2612       | 34.0  | 10608 | 11.9251         | 0.5271   |
| 17.2612       | 35.0  | 10920 | 11.9818         | 0.4729   |
| 17.2473       | 36.0  | 11232 | 12.0614         | 0.4729   |
| 17.2419       | 37.0  | 11544 | 11.8218         | 0.5271   |
| 17.2419       | 38.0  | 11856 | 11.8899         | 0.4729   |
| 17.2188       | 39.0  | 12168 | 11.8847         | 0.5271   |
| 17.2188       | 40.0  | 12480 | 11.8971         | 0.4729   |
| 17.2216       | 41.0  | 12792 | 11.8868         | 0.5271   |
| 17.2037       | 42.0  | 13104 | 11.8386         | 0.4729   |
| 17.2037       | 43.0  | 13416 | 11.8261         | 0.4729   |
| 17.2027       | 44.0  | 13728 | 11.8480         | 0.4729   |
| 17.181        | 45.0  | 14040 | 11.9217         | 0.5271   |
| 17.181        | 46.0  | 14352 | 11.8834         | 0.4729   |
| 17.1823       | 47.0  | 14664 | 11.8595         | 0.4729   |
| 17.1823       | 48.0  | 14976 | 11.8201         | 0.5271   |
| 17.1721       | 49.0  | 15288 | 11.8889         | 0.4729   |
| 17.168        | 50.0  | 15600 | 11.8029         | 0.5271   |
| 17.168        | 51.0  | 15912 | 11.8118         | 0.4729   |
| 17.1493       | 52.0  | 16224 | 11.7825         | 0.4729   |
| 17.1493       | 53.0  | 16536 | 11.8072         | 0.5271   |
| 17.1493       | 54.0  | 16848 | 11.8041         | 0.5271   |
| 17.1256       | 55.0  | 17160 | 11.8140         | 0.4729   |
| 17.1256       | 56.0  | 17472 | 11.8077         | 0.5271   |
| 17.1315       | 57.0  | 17784 | 11.8012         | 0.5271   |
| 17.1204       | 58.0  | 18096 | 11.7970         | 0.4729   |
| 17.1204       | 59.0  | 18408 | 11.7870         | 0.5271   |
| 17.1129       | 60.0  | 18720 | 11.7866         | 0.4729   |
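
The Accuracy column is plain classification accuracy on the evaluation set. Below is a minimal sketch of a metrics hook that would report it, assuming the `evaluate` library; the card does not state which metric implementation was actually used.

```python
# A minimal accuracy hook for Trainer, assuming the `evaluate` library;
# which implementation the author actually used is not stated in the card.
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # argmax over class logits
    return accuracy.compute(predictions=predictions, references=labels)
```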

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3