buruzaemon committed
Commit 84721a8
1 Parent(s): b2445e3

update model card README.md

Files changed (1)
  1. README.md +12 -16
README.md CHANGED
@@ -19,7 +19,7 @@ model-index:
     metrics:
     - name: Accuracy
       type: accuracy
-      value: 0.9180645161290323
+      value: 0.9083870967741936
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -29,12 +29,12 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the clinc_oos dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.7719
-- Accuracy: 0.9181
+- Loss: 0.8080
+- Accuracy: 0.9084
 
 ## Model description
 
-This is an initial example of knowledge-distillation where the student loss is all cross-entropy loss \\(L_{CE}\\) of the ground-truth labels and none of the distillation loss \\(L_{KD}\\).
+More information needed
 
 ## Intended uses & limitations
 
@@ -42,34 +42,30 @@ More information needed
 
 ## Training and evaluation data
 
-The training and evaluation data come straight from the `train` and `validation` splits in the clinc_oos dataset, respectively; and tokenized using the `distilbert-base-uncased` tokenization.
+More information needed
 
 ## Training procedure
 
-Please see page 224 in Chapter 8: Making Transformers Efficient in Production, Natural Language Processing with Transformers, May 2022.
-
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- num_epochs: 5
-- alpha: 1.0
-- temperature: 2.0
 - learning_rate: 2e-05
 - train_batch_size: 48
 - eval_batch_size: 48
-- seed: 42
+- seed: 8675309
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 5
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
-| No log | 1.0 | 318 | 3.2882 | 0.7426 |
-| 3.7861 | 2.0 | 636 | 1.8744 | 0.8381 |
-| 3.7861 | 3.0 | 954 | 1.1567 | 0.8958 |
-| 1.6922 | 4.0 | 1272 | 0.8569 | 0.9132 |
-| 0.9055 | 5.0 | 1590 | 0.7719 | 0.9181 |
+| No log | 1.0 | 318 | 3.3061 | 0.6681 |
+| 3.8033 | 2.0 | 636 | 1.9122 | 0.8271 |
+| 3.8033 | 3.0 | 954 | 1.1951 | 0.8832 |
+| 1.7323 | 4.0 | 1272 | 0.8907 | 0.9039 |
+| 0.9371 | 5.0 | 1590 | 0.8080 | 0.9084 |
 
 
 ### Framework versions
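
For context on the removed description and hyperparameters (`alpha: 1.0`, `temperature: 2.0`): the objective they refer to is the standard distillation loss \\(L = \alpha L_{CE} + (1 - \alpha) L_{KD}\\), where \\(\alpha = 1.0\\) reduces training to pure cross-entropy on the ground-truth labels. Below is a minimal sketch of that objective in the `DistillationTrainer` style used in Chapter 8 of *Natural Language Processing with Transformers*; the class name, constructor arguments, and `compute_loss` override here are illustrative assumptions, not this repository's actual training code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import Trainer


class DistillationTrainer(Trainer):
    """Hypothetical sketch: Trainer that blends cross-entropy and KD losses."""

    def __init__(self, *args, teacher_model=None, alpha=1.0, temperature=2.0, **kwargs):
        super().__init__(*args, **kwargs)
        self.teacher_model = teacher_model
        self.alpha = alpha              # alpha=1.0 -> all cross-entropy, no distillation
        self.temperature = temperature  # softens the logit distributions

    def compute_loss(self, model, inputs, return_outputs=False):
        # Student forward pass; .loss is cross-entropy vs. ground-truth labels
        outputs_stu = model(**inputs)
        loss_ce = outputs_stu.loss

        # Teacher forward pass without gradients
        with torch.no_grad():
            logits_tea = self.teacher_model(**inputs).logits

        # KL divergence between temperature-softened distributions,
        # scaled by T^2 to keep gradient magnitudes comparable
        kl_div = nn.KLDivLoss(reduction="batchmean")
        loss_kd = self.temperature ** 2 * kl_div(
            F.log_softmax(outputs_stu.logits / self.temperature, dim=-1),
            F.softmax(logits_tea / self.temperature, dim=-1),
        )

        # Weighted combination: with alpha=1.0 this is loss_ce alone
        loss = self.alpha * loss_ce + (1.0 - self.alpha) * loss_kd
        return (loss, outputs_stu) if return_outputs else loss
```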