
1e-2_10_0.1

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a loading sketch follows the list):

  • Loss: 0.6265
  • Accuracy: 0.5126
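The checkpoint can be loaded with the standard transformers API. A minimal sketch, assuming the model is published on the Hugging Face Hub as Onutoa/1e-2_10_0.1 (the repository name on this card) and carries a binary sequence-classification head; the card does not say which SuperGLUE subtask was used, so the input format below is illustrative only:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Repository name taken from this card; adjust if the model lives elsewhere.
model_name = "Onutoa/1e-2_10_0.1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# SuperGLUE tasks are mostly sentence-pair problems; the exact subtask is
# not stated on this card, so this pair input is a placeholder.
inputs = tokenizer("First sentence.", "Second sentence.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```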

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 0.01
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
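These values map directly onto the transformers TrainingArguments used by the Trainer API. A minimal sketch with the values copied from the list above (output_dir is a placeholder; the betas and epsilon listed for Adam match the Trainer's AdamW defaults):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="1e-2_10_0.1",        # placeholder output directory
    learning_rate=1e-2,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,                  # as listed: betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=60.0,
)
```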

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 312   | 0.6257          | 0.5307   |
| 1.3002        | 2.0   | 624   | 1.0407          | 0.5271   |
| 1.3002        | 3.0   | 936   | 1.4050          | 0.5271   |
| 1.0663        | 4.0   | 1248  | 0.9796          | 0.5271   |
| 1.0554        | 5.0   | 1560  | 1.4166          | 0.5271   |
| 1.0554        | 6.0   | 1872  | 0.9151          | 0.5271   |
| 0.8523        | 7.0   | 2184  | 0.8469          | 0.5271   |
| 0.8523        | 8.0   | 2496  | 0.8390          | 0.5271   |
| 0.8445        | 9.0   | 2808  | 0.7439          | 0.4729   |
| 0.8722        | 10.0  | 3120  | 0.6458          | 0.5343   |
| 0.8722        | 11.0  | 3432  | 0.7906          | 0.4729   |
| 0.8432        | 12.0  | 3744  | 0.6429          | 0.4946   |
| 0.7932        | 13.0  | 4056  | 0.6503          | 0.5307   |
| 0.7932        | 14.0  | 4368  | 0.7167          | 0.5271   |
| 0.7687        | 15.0  | 4680  | 0.6584          | 0.4765   |
| 0.7687        | 16.0  | 4992  | 0.6324          | 0.4874   |
| 0.7569        | 17.0  | 5304  | 0.7912          | 0.5271   |
| 0.7369        | 18.0  | 5616  | 0.7309          | 0.4729   |
| 0.7369        | 19.0  | 5928  | 0.6402          | 0.5126   |
| 0.7632        | 20.0  | 6240  | 0.7055          | 0.5271   |
| 0.7321        | 21.0  | 6552  | 0.6247          | 0.5271   |
| 0.7321        | 22.0  | 6864  | 0.7055          | 0.5271   |
| 0.7151        | 23.0  | 7176  | 0.6276          | 0.5343   |
| 0.7151        | 24.0  | 7488  | 0.6245          | 0.5271   |
| 0.7092        | 25.0  | 7800  | 0.6266          | 0.5126   |
| 0.7311        | 26.0  | 8112  | 0.6983          | 0.5271   |
| 0.7311        | 27.0  | 8424  | 0.6762          | 0.4729   |
| 0.7027        | 28.0  | 8736  | 0.6316          | 0.5018   |
| 0.7007        | 29.0  | 9048  | 0.6505          | 0.4729   |
| 0.7007        | 30.0  | 9360  | 0.7682          | 0.5271   |
| 0.6974        | 31.0  | 9672  | 0.6616          | 0.5271   |
| 0.6974        | 32.0  | 9984  | 0.6322          | 0.5271   |
| 0.6974        | 33.0  | 10296 | 0.6302          | 0.5271   |
| 0.6786        | 34.0  | 10608 | 0.6764          | 0.4729   |
| 0.6786        | 35.0  | 10920 | 0.6569          | 0.4729   |
| 0.692         | 36.0  | 11232 | 0.6584          | 0.4729   |
| 0.6814        | 37.0  | 11544 | 0.6636          | 0.5271   |
| 0.6814        | 38.0  | 11856 | 0.6477          | 0.4729   |
| 0.6767        | 39.0  | 12168 | 0.6294          | 0.5271   |
| 0.6767        | 40.0  | 12480 | 0.6487          | 0.4585   |
| 0.6762        | 41.0  | 12792 | 0.6301          | 0.5307   |
| 0.6682        | 42.0  | 13104 | 0.6252          | 0.5271   |
| 0.6682        | 43.0  | 13416 | 0.6249          | 0.5271   |
| 0.6738        | 44.0  | 13728 | 0.6334          | 0.5271   |
| 0.667         | 45.0  | 14040 | 0.6248          | 0.5271   |
| 0.667         | 46.0  | 14352 | 0.6390          | 0.5090   |
| 0.6633        | 47.0  | 14664 | 0.6622          | 0.4729   |
| 0.6633        | 48.0  | 14976 | 0.6267          | 0.4874   |
| 0.6573        | 49.0  | 15288 | 0.6256          | 0.5271   |
| 0.6559        | 50.0  | 15600 | 0.6306          | 0.4838   |
| 0.6559        | 51.0  | 15912 | 0.6412          | 0.4729   |
| 0.6455        | 52.0  | 16224 | 0.6634          | 0.4729   |
| 0.6484        | 53.0  | 16536 | 0.6247          | 0.5271   |
| 0.6484        | 54.0  | 16848 | 0.6267          | 0.5271   |
| 0.6417        | 55.0  | 17160 | 0.6295          | 0.4838   |
| 0.6417        | 56.0  | 17472 | 0.6256          | 0.5271   |
| 0.6395        | 57.0  | 17784 | 0.6268          | 0.4946   |
| 0.6418        | 58.0  | 18096 | 0.6267          | 0.4838   |
| 0.6418        | 59.0  | 18408 | 0.6260          | 0.5271   |
| 0.6373        | 60.0  | 18720 | 0.6265          | 0.5126   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
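To reproduce this environment, the versions above can be pinned directly. A minimal sketch; note that the +cu117 build suffix is dropped here, since CUDA-specific PyTorch wheels are normally installed from the PyTorch index rather than plain PyPI:

```
pip install transformers==4.30.0 torch==2.0.1 datasets==2.14.4 tokenizers==0.13.3
```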
