Electro98 committed
Commit
699d327
1 Parent(s): 4e20810

End of training

Files changed (3)
  1. README.md +9 -32
  2. config.json +1 -0
  3. tf_model.h5 +1 -1
README.md CHANGED
@@ -14,11 +14,11 @@ probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Train Loss: 0.0126
- - Validation Loss: 3.6501
- - Train F1: 0.4317
- - Train Accuracy: 0.5275
- - Epoch: 25
+ - Train Loss: 0.1805
+ - Validation Loss: 0.1696
+ - Train F1: 0.1393
+ - Train Accuracy: 0.1994
+ - Epoch: 2
 
  ## Model description
 
@@ -37,39 +37,16 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - optimizer: {'name': 'Adam', 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 81390, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
+ - optimizer: {'name': 'Adam', 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 135650, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False}
  - training_precision: float32
 
  ### Training results
 
  | Train Loss | Validation Loss | Train F1 | Train Accuracy | Epoch |
  |:----------:|:---------------:|:--------:|:--------------:|:-----:|
- | 1.7183 | 1.4170 | 0.4193 | 0.5821 | 0 |
- | 1.2940 | 1.3890 | 0.4335 | 0.5747 | 1 |
- | 1.0652 | 1.4416 | 0.4512 | 0.5718 | 2 |
- | 0.8341 | 1.5888 | 0.4364 | 0.5533 | 3 |
- | 0.6159 | 1.7822 | 0.4280 | 0.5392 | 4 |
- | 0.4387 | 1.9859 | 0.4357 | 0.5301 | 5 |
- | 0.3102 | 2.2088 | 0.4346 | 0.5257 | 6 |
- | 0.2183 | 2.3689 | 0.4353 | 0.5386 | 7 |
- | 0.1620 | 2.5631 | 0.4396 | 0.5379 | 8 |
- | 0.1254 | 2.6995 | 0.4314 | 0.5342 | 9 |
- | 0.0959 | 2.8182 | 0.4285 | 0.5333 | 10 |
- | 0.0788 | 2.8996 | 0.4204 | 0.5334 | 11 |
- | 0.0677 | 3.0209 | 0.4347 | 0.5318 | 12 |
- | 0.0562 | 3.1115 | 0.4282 | 0.5222 | 13 |
- | 0.0493 | 3.1710 | 0.4306 | 0.5268 | 14 |
- | 0.0435 | 3.1507 | 0.4280 | 0.5322 | 15 |
- | 0.0391 | 3.3222 | 0.4165 | 0.5110 | 16 |
- | 0.0321 | 3.3243 | 0.4218 | 0.5309 | 17 |
- | 0.0298 | 3.3675 | 0.4252 | 0.5307 | 18 |
- | 0.0255 | 3.4341 | 0.4148 | 0.5217 | 19 |
- | 0.0230 | 3.4253 | 0.4311 | 0.5250 | 20 |
- | 0.0195 | 3.5133 | 0.4278 | 0.5233 | 21 |
- | 0.0166 | 3.5915 | 0.4277 | 0.5301 | 22 |
- | 0.0165 | 3.5547 | 0.4191 | 0.5340 | 23 |
- | 0.0142 | 3.6109 | 0.4333 | 0.5362 | 24 |
- | 0.0126 | 3.6501 | 0.4317 | 0.5275 | 25 |
+ | 0.2043 | 0.1756 | 0.0314 | 0.0396 | 0 |
+ | 0.2031 | 0.1816 | 0.1156 | 0.2583 | 1 |
+ | 0.1805 | 0.1696 | 0.1393 | 0.1994 | 2 |
 
 
  ### Framework versions
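The only value that changes in the optimizer dict is `decay_steps` (81390 → 135650). With `power: 1.0` and `cycle: False`, a PolynomialDecay schedule reduces to a straight linear ramp from `initial_learning_rate` down to `end_learning_rate` over `decay_steps`, then holds. A minimal pure-Python sketch of that formula (an illustration of the schedule in the dict above, not the actual `tf.keras` class):

```python
def polynomial_decay_lr(step: int,
                        initial_lr: float = 2e-05,
                        end_lr: float = 0.0,
                        decay_steps: int = 135650,
                        power: float = 1.0) -> float:
    """Learning rate at `step` under a Keras-style PolynomialDecay
    schedule with cycle=False: decay toward end_lr, then hold."""
    step = min(step, decay_steps)  # past decay_steps the LR stays at end_lr
    return (initial_lr - end_lr) * (1.0 - step / decay_steps) ** power + end_lr
```

With `power=1.0` this is linear: the LR is 2e-05 at step 0, half that at the midpoint, and 0 from step 135650 onward.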
config.json CHANGED
@@ -74,6 +74,7 @@
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
+ "problem_type": "multi_label_classification",
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
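The new `problem_type` key switches the classification head to multi-label mode: each label gets an independent sigmoid (binary cross-entropy loss) instead of a softmax over classes, so several labels can be active at once. A minimal sketch of the corresponding inference step (plain Python; the 0.5 threshold is a common default and an assumption here, not something this commit sets):

```python
import math

def multi_label_predict(logits: list[float], threshold: float = 0.5) -> list[int]:
    """Turn raw per-label logits into independent 0/1 predictions,
    as multi-label classification implies (sigmoid per label,
    not a softmax across labels)."""
    probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    return [int(p >= threshold) for p in probs]
```

Note that, unlike softmax, the predictions are independent: `[2.0, 1.5, 1.0]` can yield all three labels active.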
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:31d3d358a2efc883e5b03b8e23e9a9dc9385652d85f801db235697f9b397d722
+ oid sha256:dd472e2aea5edcf1e1ffadb87858b13d76ec72cb6d0e3783ebfca4190cc0d653
  size 268031680
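What the repo actually versions for `tf_model.h5` is a Git LFS pointer file (a `version` line, an `oid sha256:<digest>` line, and a `size` line), not the weights themselves; this commit swaps the digest while the payload stays 268031680 bytes. A small sketch of parsing such a pointer into its fields:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file of the form:
        version https://git-lfs.github.com/spec/v1
        oid sha256:<hex digest>
        size <bytes>
    into its version, hash algorithm, digest, and size fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }
```

Running this on the new pointer from the diff above recovers the `dd472e…` digest and the 268031680-byte size.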