2_5e-3_5_0.5

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0090
  • Accuracy: 0.6991
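
A minimal loading sketch follows, assuming the checkpoint exposes a standard sequence-classification head; the SuperGLUE subset (and thus the label set) is not documented in this card, and the repo id Onutoa/2_5e-3_5_0.5 is taken from the card's dataset link:

```python
# Hedged example: load the fine-tuned checkpoint for inference.
# Assumes a standard sequence-classification head; adjust if the
# actual task head differs (the SuperGLUE subset is unspecified).
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/2_5e-3_5_0.5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

inputs = tokenizer("Example sentence to classify.", return_tensors="pt")
logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```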

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 0.005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
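
For reference, the listed settings map onto transformers TrainingArguments as sketched below; the output_dir and the per-epoch evaluation strategy are assumptions (inferred from the model name and the per-epoch validation table), not statements from the original card:

```python
# Sketch only: maps the hyperparameters listed above onto TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="2_5e-3_5_0.5",    # assumption: named after the model
    learning_rate=5e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
    evaluation_strategy="epoch",  # inferred from the per-epoch results table
)
```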

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 2.0566 | 1.0 | 590 | 1.9336 | 0.6208 |
| 1.8329 | 2.0 | 1180 | 1.8941 | 0.6226 |
| 1.8027 | 3.0 | 1770 | 1.6503 | 0.6043 |
| 1.7269 | 4.0 | 2360 | 1.7276 | 0.5180 |
| 1.7224 | 5.0 | 2950 | 1.7866 | 0.6223 |
| 1.6611 | 6.0 | 3540 | 1.6363 | 0.5988 |
| 1.6862 | 7.0 | 4130 | 1.7201 | 0.5593 |
| 1.5648 | 8.0 | 4720 | 1.7083 | 0.6339 |
| 1.5735 | 9.0 | 5310 | 1.5898 | 0.5991 |
| 1.5494 | 10.0 | 5900 | 1.6325 | 0.6385 |
| 1.5284 | 11.0 | 6490 | 1.6925 | 0.6303 |
| 1.478 | 12.0 | 7080 | 1.7338 | 0.5355 |
| 1.5236 | 13.0 | 7670 | 1.5156 | 0.6394 |
| 1.46 | 14.0 | 8260 | 1.8612 | 0.6321 |
| 1.4214 | 15.0 | 8850 | 1.4616 | 0.6471 |
| 1.4158 | 16.0 | 9440 | 1.5174 | 0.6089 |
| 1.3776 | 17.0 | 10030 | 1.4633 | 0.6278 |
| 1.344 | 18.0 | 10620 | 1.4902 | 0.6135 |
| 1.3644 | 19.0 | 11210 | 1.3897 | 0.6615 |
| 1.3559 | 20.0 | 11800 | 1.3980 | 0.6670 |
| 1.3053 | 21.0 | 12390 | 1.4601 | 0.6651 |
| 1.3035 | 22.0 | 12980 | 1.3306 | 0.6700 |
| 1.3067 | 23.0 | 13570 | 1.3644 | 0.6700 |
| 1.2856 | 24.0 | 14160 | 1.2897 | 0.6691 |
| 1.2743 | 25.0 | 14750 | 1.3909 | 0.6691 |
| 1.2704 | 26.0 | 15340 | 1.2935 | 0.6642 |
| 1.2606 | 27.0 | 15930 | 1.2985 | 0.6425 |
| 1.2164 | 28.0 | 16520 | 1.3179 | 0.6761 |
| 1.2137 | 29.0 | 17110 | 1.2708 | 0.6768 |
| 1.2185 | 30.0 | 17700 | 1.2182 | 0.6862 |
| 1.1769 | 31.0 | 18290 | 1.2422 | 0.6682 |
| 1.1815 | 32.0 | 18880 | 1.3006 | 0.6777 |
| 1.1648 | 33.0 | 19470 | 1.2125 | 0.6862 |
| 1.1368 | 34.0 | 20060 | 1.1602 | 0.6661 |
| 1.1736 | 35.0 | 20650 | 1.1483 | 0.6835 |
| 1.1383 | 36.0 | 21240 | 1.1702 | 0.6896 |
| 1.1406 | 37.0 | 21830 | 1.1127 | 0.6835 |
| 1.1461 | 38.0 | 22420 | 1.1293 | 0.6875 |
| 1.1199 | 39.0 | 23010 | 1.1855 | 0.6881 |
| 1.0878 | 40.0 | 23600 | 1.1871 | 0.6902 |
| 1.0852 | 41.0 | 24190 | 1.0959 | 0.6936 |
| 1.0873 | 42.0 | 24780 | 1.1361 | 0.6942 |
| 1.0633 | 43.0 | 25370 | 1.0750 | 0.6911 |
| 1.0758 | 44.0 | 25960 | 1.1282 | 0.6645 |
| 1.0446 | 45.0 | 26550 | 1.0763 | 0.6832 |
| 1.0373 | 46.0 | 27140 | 1.0759 | 0.6817 |
| 1.0318 | 47.0 | 27730 | 1.0454 | 0.6908 |
| 1.0354 | 48.0 | 28320 | 1.0636 | 0.7031 |
| 1.0276 | 49.0 | 28910 | 1.0394 | 0.6927 |
| 1.0211 | 50.0 | 29500 | 1.0369 | 0.7015 |
| 1.0021 | 51.0 | 30090 | 1.0366 | 0.6865 |
| 0.983 | 52.0 | 30680 | 1.0274 | 0.6960 |
| 1.0137 | 53.0 | 31270 | 1.0278 | 0.7028 |
| 0.9825 | 54.0 | 31860 | 1.0339 | 0.6899 |
| 0.9792 | 55.0 | 32450 | 1.0142 | 0.6969 |
| 0.9937 | 56.0 | 33040 | 1.0140 | 0.7024 |
| 0.9755 | 57.0 | 33630 | 1.0173 | 0.6972 |
| 0.9517 | 58.0 | 34220 | 1.0078 | 0.7 |
| 0.988 | 59.0 | 34810 | 1.0116 | 0.7018 |
| 0.9702 | 60.0 | 35400 | 1.0090 | 0.6991 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
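
To reproduce this environment, the versions above can be pinned at install time; a sketch, where the cu117 wheel index is inferred from the "+cu117" build tag rather than stated in the card:

```bash
pip install transformers==4.30.0 datasets==2.14.4 tokenizers==0.13.3
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu117
```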