athrado committed on
Commit
b301b9a
1 Parent(s): 25c8264

Training in progress epoch 0

Files changed (3)
  1. README.md +14 -19
  2. config.json +1 -1
  3. tf_model.h5 +1 -1
README.md CHANGED
@@ -1,5 +1,6 @@
  ---
  license: apache-2.0
+ base_model: bert-base-uncased
  tags:
  - generated_from_keras_callback
  model-index:
@@ -12,49 +13,43 @@ probably proofread and complete it, then remove this comment. -->
 
  # athrado/bert-finetuned-nli
 
- This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [sick](https://huggingface.co/datasets/sick) dataset for natural language inference.
+ This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Train Loss: 0.0671
- - Train Accuracy: 0.9806
- - Validation Loss: 0.5285
- - Validation Accuracy: 0.8424
- - Epoch: 4
+ - Train Loss: 0.5542
+ - Train Accuracy: 0.7587
+ - Validation Loss: 0.4403
+ - Validation Accuracy: 0.8343
+ - Epoch: 0
 
  ## Model description
 
- Example model for educational purposes: fine-tuning the bert-base-uncased model for natural language inference.
+ More information needed
 
  ## Intended uses & limitations
 
- - Learning about transformer model and fine-tuning
- - Natural language inference fine-tuned on small dataset
+ More information needed
 
  ## Training and evaluation data
 
- The model is evaluated using the sick validation. We report accuracy, and in addition we computed a weighted F1-score of 0.842.
+ More information needed
 
  ## Training procedure
 
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': False, 'is_legacy_optimizer': False, 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-05, 'decay_steps': 2775, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
+ - optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': False, 'is_legacy_optimizer': False, 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-05, 'decay_steps': 2775, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
  - training_precision: float32
 
  ### Training results
 
  | Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
  |:----------:|:--------------:|:---------------:|:-------------------:|:-----:|
- | 0.5345 | 0.7810 | 0.4586 | 0.8283 | 0 |
- | 0.3253 | 0.8797 | 0.3890 | 0.8404 | 1 |
- | 0.2070 | 0.9290 | 0.4210 | 0.8303 | 2 |
- | 0.1171 | 0.9610 | 0.5200 | 0.8424 | 3 |
- | 0.0671 | 0.9806 | 0.5285 | 0.8424 | 4 |
+ | 0.5542 | 0.7587 | 0.4403 | 0.8343 | 0 |
 
 
  ### Framework versions
 
- - Transformers 4.30.2
- - TensorFlow 2.12.0
- - Datasets 2.13.1
+ - Transformers 4.31.0
+ - TensorFlow 2.13.0
  - Tokenizers 0.13.3
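Note on the optimizer entry in the README.md diff: it records a Keras `PolynomialDecay` schedule with `initial_learning_rate` 5e-05, `decay_steps` 2775, `end_learning_rate` 0.0, `power` 1.0 and `cycle` False. As a rough sketch of what those values imply, the pure-Python function below reimplements the non-cycling decay formula as it is documented for Keras. The function name `polynomial_decay` and the sample steps are illustrative assumptions and do not come from this repository.

```python
def polynomial_decay(step, initial_learning_rate=5e-05, decay_steps=2775,
                     end_learning_rate=0.0, power=1.0):
    """Learning rate at `step` for a non-cycling polynomial decay schedule.

    Defaults mirror the values recorded in the README diff; the function
    itself is a hypothetical re-implementation, not the training code.
    """
    # With cycle=False the step is clamped, so the rate stays at
    # end_learning_rate once decay_steps has been reached.
    step = min(step, decay_steps)
    remaining = 1.0 - step / decay_steps
    return ((initial_learning_rate - end_learning_rate) * remaining ** power
            + end_learning_rate)


# Sample steps chosen only for illustration (555 is one fifth of 2775).
for step in (0, 555, 2775, 5000):
    print(step, polynomial_decay(step))
```

With `power` 1.0 the schedule is a straight line from 5e-05 down to 0.0 over 2775 optimizer steps, and it stays at 0.0 afterwards.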
config.json CHANGED
@@ -28,7 +28,7 @@
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
- "transformers_version": "4.30.2",
+ "transformers_version": "4.31.0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9570b1bc29b01b3af2aade89d8cd5bc73e6b45afa3ea4d70b2df37ed8a5474ce
+ oid sha256:b9114f23b29209780a9fe54eb5847b459a6cc473406b1dc93f9db43bc2aefe91
  size 438226204
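
Note on the tf_model.h5 diff: only the Git LFS pointer changes, so the diff shows a new `oid sha256:` value while `size` stays at 438226204 bytes. A minimal sketch of how a downloaded copy could be checked against such a pointer follows, assuming the pointer's `oid` is the SHA-256 of the file contents, as the Git LFS pointer format describes. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, as is any local file path passed to them.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 in the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected_oid = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported hash: {algo}")
    # Cheap check first: a size mismatch rules the file out without hashing.
    if os.path.getsize(path) != int(fields["size"]):
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Hash in chunks so a ~438 MB weights file is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_oid


# Pointer contents after this commit, copied from the diff.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b9114f23b29209780a9fe54eb5847b459a6cc473406b1dc93f9db43bc2aefe91
size 438226204
"""
```

Calling `matches_pointer("tf_model.h5", NEW_POINTER)` on a local download (the path is an assumption) would indicate whether that file corresponds to the weights from this commit rather than the previous one.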