---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- super_glue
metrics:
- accuracy
model-index:
- name: '20230903070300'
  results: []
---

# 20230903070300

This model is a fine-tuned version of [bert-large-cased](https://huggingface.co/bert-large-cased) on the super_glue dataset. It achieves the following results on the evaluation set:

- Loss: 0.8203
- Accuracy: 0.6599
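
As a quick start, here is a minimal inference sketch. The repository id `dkqjrm/20230903070300` is inferred from the uploader and card name, and `AutoModelForSequenceClassification` is an assumption, since the card does not state the task head:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Repository id inferred from the card; adjust if the checkpoint lives elsewhere.
model_id = "dkqjrm/20230903070300"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Illustrative sentence-pair input; the exact SuperGLUE subtask (and thus the
# expected input format) is not documented on this card.
inputs = tokenizer("The sky is blue.", "Is the sky blue?", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```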

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
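
The card names only the super_glue dataset collection. A minimal sketch of loading one of its configurations with the `datasets` library follows; the `boolq` subtask is an illustrative assumption, not confirmed by the card:

```python
from datasets import load_dataset

# "boolq" is an illustrative SuperGLUE configuration; the card does not say
# which subtask this model was actually trained on.
dataset = load_dataset("super_glue", "boolq")
print(dataset)              # splits and sizes
print(dataset["train"][0])  # one raw example
```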

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 0.0002
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 80.0
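
For reference, a minimal `TrainingArguments` sketch mirroring the list above; `output_dir` and the per-epoch evaluation cadence are assumptions, and the Adam betas/epsilon match the library defaults, so they are not set explicitly:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="20230903070300",  # assumption: not stated on the card
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    lr_scheduler_type="linear",
    num_train_epochs=80.0,
    evaluation_strategy="epoch",  # assumption, based on the per-epoch results table
)
```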

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| No log        | 1.0   | 340   | 0.7251          | 0.5063   |
| 0.7449        | 2.0   | 680   | 0.7348          | 0.5      |
| 0.7388        | 3.0   | 1020  | 0.7304          | 0.5      |
| 0.7388        | 4.0   | 1360  | 0.7639          | 0.5      |
| 0.7384        | 5.0   | 1700  | 0.7316          | 0.5      |
| 0.7376        | 6.0   | 2040  | 0.7268          | 0.5      |
| 0.7376        | 7.0   | 2380  | 0.7263          | 0.5      |
| 0.7328        | 8.0   | 2720  | 0.7333          | 0.5      |
| 0.7266        | 9.0   | 3060  | 0.7533          | 0.5      |
| 0.7266        | 10.0  | 3400  | 0.7247          | 0.4984   |
| 0.7293        | 11.0  | 3740  | 0.7290          | 0.5172   |
| 0.7248        | 12.0  | 4080  | 0.7539          | 0.5      |
| 0.7248        | 13.0  | 4420  | 0.7395          | 0.5      |
| 0.7255        | 14.0  | 4760  | 0.7360          | 0.5031   |
| 0.7271        | 15.0  | 5100  | 0.7278          | 0.5      |
| 0.7271        | 16.0  | 5440  | 0.7314          | 0.5094   |
| 0.7265        | 17.0  | 5780  | 0.7417          | 0.4984   |
| 0.724         | 18.0  | 6120  | 0.7263          | 0.5      |
| 0.724         | 19.0  | 6460  | 0.7272          | 0.5031   |
| 0.723         | 20.0  | 6800  | 0.7283          | 0.5172   |
| 0.7254        | 21.0  | 7140  | 0.7284          | 0.5047   |
| 0.7254        | 22.0  | 7480  | 0.7346          | 0.4984   |
| 0.7254        | 23.0  | 7820  | 0.7295          | 0.5125   |
| 0.7259        | 24.0  | 8160  | 0.7322          | 0.5047   |
| 0.7235        | 25.0  | 8500  | 0.7327          | 0.5172   |
| 0.7235        | 26.0  | 8840  | 0.7300          | 0.5172   |
| 0.7241        | 27.0  | 9180  | 0.7345          | 0.5016   |
| 0.7227        | 28.0  | 9520  | 0.7263          | 0.5172   |
| 0.7227        | 29.0  | 9860  | 0.7341          | 0.5016   |
| 0.7212        | 30.0  | 10200 | 0.7302          | 0.5125   |
| 0.7226        | 31.0  | 10540 | 0.7346          | 0.5078   |
| 0.7226        | 32.0  | 10880 | 0.7606          | 0.4702   |
| 0.7195        | 33.0  | 11220 | 0.7357          | 0.5063   |
| 0.7226        | 34.0  | 11560 | 0.7356          | 0.5031   |
| 0.7226        | 35.0  | 11900 | 0.7397          | 0.5063   |
| 0.7224        | 36.0  | 12240 | 0.7340          | 0.5157   |
| 0.7216        | 37.0  | 12580 | 0.7319          | 0.5047   |
| 0.7216        | 38.0  | 12920 | 0.7298          | 0.5141   |
| 0.7225        | 39.0  | 13260 | 0.7438          | 0.5016   |
| 0.7197        | 40.0  | 13600 | 0.7306          | 0.5047   |
| 0.7197        | 41.0  | 13940 | 0.7279          | 0.5125   |
| 0.7206        | 42.0  | 14280 | 0.7181          | 0.5502   |
| 0.7079        | 43.0  | 14620 | 0.7566          | 0.5862   |
| 0.7079        | 44.0  | 14960 | 0.7480          | 0.6254   |
| 0.6794        | 45.0  | 15300 | 0.6922          | 0.6630   |
| 0.6556        | 46.0  | 15640 | 0.7232          | 0.6223   |
| 0.6556        | 47.0  | 15980 | 0.6961          | 0.6458   |
| 0.6438        | 48.0  | 16320 | 0.7193          | 0.6458   |
| 0.6249        | 49.0  | 16660 | 0.6663          | 0.6693   |
| 0.6117        | 50.0  | 17000 | 0.8045          | 0.6191   |
| 0.6117        | 51.0  | 17340 | 0.6984          | 0.6630   |
| 0.5961        | 52.0  | 17680 | 0.6973          | 0.6646   |
| 0.5831        | 53.0  | 18020 | 0.7606          | 0.6348   |
| 0.5831        | 54.0  | 18360 | 0.7159          | 0.6614   |
| 0.5624        | 55.0  | 18700 | 0.7947          | 0.6426   |
| 0.558         | 56.0  | 19040 | 0.8629          | 0.6238   |
| 0.558         | 57.0  | 19380 | 0.7299          | 0.6646   |
| 0.5461        | 58.0  | 19720 | 0.7642          | 0.6411   |
| 0.5322        | 59.0  | 20060 | 0.7357          | 0.6661   |
| 0.5322        | 60.0  | 20400 | 0.8926          | 0.6191   |
| 0.5253        | 61.0  | 20740 | 0.7845          | 0.6348   |
| 0.5193        | 62.0  | 21080 | 0.7580          | 0.6614   |
| 0.5193        | 63.0  | 21420 | 0.7705          | 0.6505   |
| 0.5169        | 64.0  | 21760 | 0.8464          | 0.6458   |
| 0.5021        | 65.0  | 22100 | 0.8002          | 0.6536   |
| 0.5021        | 66.0  | 22440 | 0.7595          | 0.6677   |
| 0.487         | 67.0  | 22780 | 0.7971          | 0.6458   |
| 0.4977        | 68.0  | 23120 | 0.8245          | 0.6270   |
| 0.4977        | 69.0  | 23460 | 0.8225          | 0.6379   |
| 0.4822        | 70.0  | 23800 | 0.8323          | 0.6364   |
| 0.4802        | 71.0  | 24140 | 0.8205          | 0.6364   |
| 0.4802        | 72.0  | 24480 | 0.8086          | 0.6520   |
| 0.4779        | 73.0  | 24820 | 0.7994          | 0.6567   |
| 0.4801        | 74.0  | 25160 | 0.8206          | 0.6520   |
| 0.4706        | 75.0  | 25500 | 0.8035          | 0.6442   |
| 0.4706        | 76.0  | 25840 | 0.8213          | 0.6364   |
| 0.4738        | 77.0  | 26180 | 0.8128          | 0.6630   |
| 0.4687        | 78.0  | 26520 | 0.8068          | 0.6567   |
| 0.4687        | 79.0  | 26860 | 0.8098          | 0.6630   |
| 0.4598        | 80.0  | 27200 | 0.8203          | 0.6599   |

### Framework versions

- Transformers 4.26.1
- Pytorch 2.0.1+cu118
- Datasets 2.12.0
- Tokenizers 0.13.3