2_2e-3_1_0.1

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. After the final (60th) training epoch, it achieves the following results on the evaluation set:

Loss: 0.5541
Accuracy: 0.7003

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.002
train_batch_size: 16
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 60.0
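
As a rough illustration of what the linear scheduler implies for this run, the sketch below computes the learning rate after a given number of optimizer steps. It assumes zero warmup steps (the card lists no warmup setting) and decay to zero at the final step; the per-epoch step count of 590 is taken from the training results table. The function name `linear_lr` is hypothetical and not part of any library.

```python
# Sketch of a linear learning-rate decay, assuming zero warmup steps
# (the card does not list a warmup setting).

PEAK_LR = 0.002        # learning_rate from the card
STEPS_PER_EPOCH = 590  # step count after epoch 1 in the results table
NUM_EPOCHS = 60        # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 35400, the final step in the table


def linear_lr(step: int, peak_lr: float = PEAK_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer updates under linear decay to zero."""
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / total_steps


if __name__ == "__main__":
    # Print the assumed learning rate at a few epoch boundaries.
    for epoch in (0, 15, 30, 45, 60):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.6f}")
```

Under these assumptions the learning rate is halved by the end of epoch 30 and reaches zero at step 35400.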

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
0.8034	1.0	590	0.6537	0.6217
0.8338	2.0	1180	0.7014	0.6217
0.8142	3.0	1770	0.6716	0.5596
0.7701	4.0	2360	0.6599	0.6217
0.7412	5.0	2950	0.7053	0.6217
0.7414	6.0	3540	0.6539	0.6217
0.7411	7.0	4130	0.9828	0.3817
0.7237	8.0	4720	0.6571	0.6061
0.7339	9.0	5310	0.6448	0.6232
0.7005	10.0	5900	0.6632	0.6223
0.7171	11.0	6490	0.6442	0.6220
0.7084	12.0	7080	0.7522	0.4477
0.6985	13.0	7670	0.6253	0.6336
0.7044	14.0	8260	0.7021	0.6217
0.6752	15.0	8850	0.6321	0.6183
0.6817	16.0	9440	0.6388	0.6073
0.6715	17.0	10030	0.6276	0.6358
0.6591	18.0	10620	0.6297	0.6474
0.6681	19.0	11210	0.6139	0.6407
0.6595	20.0	11800	0.6048	0.6541
0.6463	21.0	12390	0.6135	0.6541
0.6391	22.0	12980	0.6181	0.6437
0.6407	23.0	13570	0.6047	0.6615
0.6226	24.0	14160	0.6077	0.6615
0.6271	25.0	14750	0.6129	0.6642
0.6288	26.0	15340	0.6329	0.6343
0.6254	27.0	15930	0.5903	0.6728
0.6085	28.0	16520	0.5946	0.6743
0.6107	29.0	17110	0.5848	0.6737
0.5917	30.0	17700	0.6179	0.6725
0.5997	31.0	18290	0.5991	0.6618
0.5877	32.0	18880	0.6386	0.6709
0.5894	33.0	19470	0.5830	0.6771
0.5804	34.0	20060	0.5765	0.6856
0.5751	35.0	20650	0.5944	0.6615
0.5825	36.0	21240	0.5702	0.6890
0.5824	37.0	21830	0.5807	0.6774
0.5671	38.0	22420	0.5671	0.6838
0.5730	39.0	23010	0.5678	0.6862
0.5615	40.0	23600	0.5685	0.6893
0.5658	41.0	24190	0.5820	0.6792
0.5669	42.0	24780	0.5692	0.6902
0.5663	43.0	25370	0.5665	0.6881
0.5533	44.0	25960	0.5599	0.6920
0.5552	45.0	26550	0.5637	0.6905
0.5515	46.0	27140	0.5616	0.6893
0.5593	47.0	27730	0.5650	0.6887
0.5487	48.0	28320	0.5620	0.6948
0.5563	49.0	28910	0.5631	0.6911
0.5486	50.0	29500	0.5604	0.6972
0.5464	51.0	30090	0.5590	0.6939
0.5469	52.0	30680	0.5561	0.6969
0.5458	53.0	31270	0.5573	0.7000
0.5425	54.0	31860	0.5558	0.6976
0.5412	55.0	32450	0.5552	0.6991
0.5434	56.0	33040	0.5564	0.6979
0.5363	57.0	33630	0.5536	0.6982
0.5404	58.0	34220	0.5556	0.6982
0.5378	59.0	34810	0.5542	0.6991
0.5431	60.0	35400	0.5541	0.7003
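
The table can be scanned programmatically to choose a checkpoint. The sketch below, which copies only the last eight rows of the table for brevity, picks the best row by validation loss and by accuracy. The two criteria do not agree: the lowest validation loss in the table (0.5536) occurs at epoch 57, while the highest accuracy (0.7003) occurs at epoch 60. The variable names are illustrative only.

```python
# Select the best epoch by each metric from the final rows of the results table.
# Each tuple: (training_loss, epoch, step, validation_loss, accuracy)
ROWS = [
    (0.5458, 53, 31270, 0.5573, 0.7000),
    (0.5425, 54, 31860, 0.5558, 0.6976),
    (0.5412, 55, 32450, 0.5552, 0.6991),
    (0.5434, 56, 33040, 0.5564, 0.6979),
    (0.5363, 57, 33630, 0.5536, 0.6982),
    (0.5404, 58, 34220, 0.5556, 0.6982),
    (0.5378, 59, 34810, 0.5542, 0.6991),
    (0.5431, 60, 35400, 0.5541, 0.7003),
]

# Lower validation loss is better; higher accuracy is better.
best_by_loss = min(ROWS, key=lambda row: row[3])
best_by_acc = max(ROWS, key=lambda row: row[4])

if __name__ == "__main__":
    print(f"lowest validation loss: {best_by_loss[3]} at epoch {best_by_loss[1]}")
    print(f"highest accuracy:       {best_by_acc[4]} at epoch {best_by_acc[1]}")
```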

Framework versions

Transformers 4.30.0
Pytorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3
