---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- super_glue
metrics:
- accuracy
model-index:
- name: 2_4e-3_1_0.1
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# 2_4e-3_1_0.1

This model is a fine-tuned version of [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the super_glue dataset.
After the final training epoch (epoch 60, step 35400), it achieves the following results on the evaluation set:
- Loss: 0.5293
- Accuracy: 0.7272

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.004
- train_batch_size: 16
- eval_batch_size: 8
- seed: 11
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 60.0
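
As a rough illustration of what `lr_scheduler_type: linear` implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and a decay to zero over 35,400 total steps, which is the final step count in the results table (590 steps per epoch over 60 epochs). The function name `linear_lr` is hypothetical; it reproduces the usual linear-decay formula and does not call the Transformers scheduler itself.

```python
def linear_lr(step: int, base_lr: float = 0.004, total_steps: int = 35400,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero.

    Defaults are taken from this card; warmup_steps=0 is an assumption.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 590 optimizer steps per epoch, as reported in the results table.
STEPS_PER_EPOCH = 590

for epoch in (0, 30, 60):
    print(f"epoch {epoch:2d}: lr = {linear_lr(epoch * STEPS_PER_EPOCH):.6f}")
```

Under these assumptions the learning rate starts at 0.004, is halved by epoch 30, and reaches zero at the last step.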

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 0.8756        | 1.0   | 590   | 0.9984          | 0.6211   |
| 0.8309        | 2.0   | 1180  | 0.7494          | 0.6217   |
| 0.8162        | 3.0   | 1770  | 0.8910          | 0.3826   |
| 0.8025        | 4.0   | 2360  | 0.6504          | 0.6028   |
| 0.8059        | 5.0   | 2950  | 0.6535          | 0.5945   |
| 0.7680        | 6.0   | 3540  | 0.6293          | 0.6291   |
| 0.7423        | 7.0   | 4130  | 0.9356          | 0.4339   |
| 0.7272        | 8.0   | 4720  | 0.7985          | 0.6220   |
| 0.7076        | 9.0   | 5310  | 0.6240          | 0.6541   |
| 0.6803        | 10.0  | 5900  | 0.6284          | 0.6639   |
| 0.6637        | 11.0  | 6490  | 0.6013          | 0.6691   |
| 0.6217        | 12.0  | 7080  | 0.5783          | 0.6725   |
| 0.6169        | 13.0  | 7670  | 0.5657          | 0.6841   |
| 0.5962        | 14.0  | 8260  | 0.6273          | 0.6618   |
| 0.5937        | 15.0  | 8850  | 0.5982          | 0.6725   |
| 0.5811        | 16.0  | 9440  | 0.6778          | 0.5997   |
| 0.5534        | 17.0  | 10030 | 0.5478          | 0.7028   |
| 0.5641        | 18.0  | 10620 | 0.5615          | 0.7034   |
| 0.5588        | 19.0  | 11210 | 0.5467          | 0.7076   |
| 0.5611        | 20.0  | 11800 | 0.5505          | 0.7058   |
| 0.5423        | 21.0  | 12390 | 0.5617          | 0.7086   |
| 0.5372        | 22.0  | 12980 | 0.5483          | 0.7003   |
| 0.5387        | 23.0  | 13570 | 0.5560          | 0.7113   |
| 0.5274        | 24.0  | 14160 | 0.5278          | 0.7131   |
| 0.5242        | 25.0  | 14750 | 0.5377          | 0.7150   |
| 0.5256        | 26.0  | 15340 | 0.5796          | 0.6856   |
| 0.5203        | 27.0  | 15930 | 0.5456          | 0.6976   |
| 0.5087        | 28.0  | 16520 | 0.5365          | 0.7199   |
| 0.5127        | 29.0  | 17110 | 0.5419          | 0.7049   |
| 0.5005        | 30.0  | 17700 | 0.5417          | 0.7257   |
| 0.5008        | 31.0  | 18290 | 0.5257          | 0.7116   |
| 0.4959        | 32.0  | 18880 | 0.5463          | 0.7232   |
| 0.4931        | 33.0  | 19470 | 0.5251          | 0.7260   |
| 0.4849        | 34.0  | 20060 | 0.5282          | 0.7217   |
| 0.4733        | 35.0  | 20650 | 0.5296          | 0.7199   |
| 0.4842        | 36.0  | 21240 | 0.5230          | 0.7229   |
| 0.4811        | 37.0  | 21830 | 0.5264          | 0.7232   |
| 0.4683        | 38.0  | 22420 | 0.5518          | 0.7058   |
| 0.4692        | 39.0  | 23010 | 0.5256          | 0.7300   |
| 0.4621        | 40.0  | 23600 | 0.5292          | 0.7303   |
| 0.4624        | 41.0  | 24190 | 0.5467          | 0.7110   |
| 0.4618        | 42.0  | 24780 | 0.5189          | 0.7324   |
| 0.4650        | 43.0  | 25370 | 0.5285          | 0.7330   |
| 0.4530        | 44.0  | 25960 | 0.5577          | 0.7113   |
| 0.4533        | 45.0  | 26550 | 0.5170          | 0.7343   |
| 0.4524        | 46.0  | 27140 | 0.5219          | 0.7223   |
| 0.4454        | 47.0  | 27730 | 0.5367          | 0.7257   |
| 0.4401        | 48.0  | 28320 | 0.5251          | 0.7339   |
| 0.4547        | 49.0  | 28910 | 0.5300          | 0.7254   |
| 0.4374        | 50.0  | 29500 | 0.5318          | 0.7278   |
| 0.4440        | 51.0  | 30090 | 0.5317          | 0.7239   |
| 0.4363        | 52.0  | 30680 | 0.5309          | 0.7306   |
| 0.4381        | 53.0  | 31270 | 0.5206          | 0.7312   |
| 0.4314        | 54.0  | 31860 | 0.5283          | 0.7269   |
| 0.4334        | 55.0  | 32450 | 0.5254          | 0.7278   |
| 0.4300        | 56.0  | 33040 | 0.5317          | 0.7278   |
| 0.4194        | 57.0  | 33630 | 0.5261          | 0.7272   |
| 0.4341        | 58.0  | 34220 | 0.5266          | 0.7300   |
| 0.4243        | 59.0  | 34810 | 0.5269          | 0.7275   |
| 0.4191        | 60.0  | 35400 | 0.5293          | 0.7272   |
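
In the table above, the highest accuracy (0.7343) and the lowest validation loss (0.5170) both occur at epoch 45 (step 26550), not at the final epoch. The following sketch shows one way to pick such a checkpoint from a table in this format. The names `EvalRow`, `parse_rows`, and `EXCERPT` are hypothetical, and the excerpt contains only four rows copied from the table, so the full table would need to be pasted in to reproduce the selection over all 60 epochs.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float


def parse_rows(markdown: str) -> List[EvalRow]:
    """Parse data rows of a pipe-delimited table, skipping header and separator lines."""
    rows = []
    for line in markdown.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        try:
            rows.append(EvalRow(float(cells[0]), float(cells[1]), int(cells[2]),
                                float(cells[3]), float(cells[4])))
        except (ValueError, IndexError):
            continue  # header or alignment row, not numeric data
    return rows


# Four rows copied from the results table (epochs 42, 45, 48, 60).
EXCERPT = """
| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 0.4618        | 42.0  | 24780 | 0.5189          | 0.7324   |
| 0.4533        | 45.0  | 26550 | 0.5170          | 0.7343   |
| 0.4401        | 48.0  | 28320 | 0.5251          | 0.7339   |
| 0.4191        | 60.0  | 35400 | 0.5293          | 0.7272   |
"""

rows = parse_rows(EXCERPT)
best = max(rows, key=lambda r: r.accuracy)
print(f"best accuracy {best.accuracy} at epoch {best.epoch:g} (step {best.step})")
```

Whether the published weights correspond to the final step or to an earlier checkpoint is not stated in this card, so the selection above is only a way to read the table.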


### Framework versions

- Transformers 4.30.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.4
- Tokenizers 0.13.3
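
To compare a local environment against the versions listed above, a small check along these lines may help. It is a sketch, not part of the original training setup: the helper `version_mismatches` and the `EXPECTED` mapping are hypothetical names, and the PyTorch entry assumes the distribution is installed under the name `torch` with the same `+cu117` build suffix.

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple

# Versions copied from the list above; "torch" is the PyPI name for PyTorch.
EXPECTED: Dict[str, str] = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",  # the local build suffix may differ per platform
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def version_mismatches(
    expected: Dict[str, str],
    lookup: Callable[[str], str] = metadata.version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (wanted, installed)} for every package that differs.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = lookup(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


for package, (wanted, installed) in version_mismatches(EXPECTED).items():
    print(f"{package}: card lists {wanted}, found {installed}")
```

An empty result means the environment matches the listed versions exactly; a mismatch does not necessarily mean the model will fail to load.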