buruzaemon committed
Commit 9406510 • Parent(s): c19c52c
Update README.md

README.md CHANGED
@@ -18,7 +18,7 @@ This model is a fine-tuned version of [distilbert-base-uncased](https://huggingf

## Model description

-
+This is an initial example of knowledge distillation in which the student loss is entirely the cross-entropy loss $L_{CE}$ on the ground-truth labels, with no contribution from the distillation loss.

## Intended uses & limitations

@@ -26,7 +26,7 @@ More information needed

## Training and evaluation data

-The training and evaluation data come straight from the `train` and `validation` splits in the clinc_oos dataset, respectively; and tokenized using the distilbert-base-uncased tokenization.
+The training and evaluation data come directly from the `train` and `validation` splits of the clinc_oos dataset, respectively, and are tokenized with the `distilbert-base-uncased` tokenizer.

## Training procedure
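For context on the model-description change above: in the standard knowledge-distillation objective (Hinton et al.), the student is trained on $L = \alpha L_{CE} + (1 - \alpha) L_{KD}$, so "all cross-entropy loss and none of the distillation loss" corresponds to $\alpha = 1$. The following PyTorch sketch is illustrative only; the function name and default argument values are assumptions, not code taken from this repo.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      alpha=1.0, temperature=2.0):
    # Hypothetical sketch, not this repo's training code.
    # Hard-label cross-entropy against the ground-truth intents.
    ce_loss = F.cross_entropy(student_logits, labels)
    # Soft-label term: KL divergence between temperature-softened
    # student and teacher distributions; the T**2 factor keeps
    # gradient magnitudes comparable across temperatures.
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # With alpha = 1.0, as this model card describes, the KD term
    # drops out and the loss reduces to plain cross-entropy.
    return alpha * ce_loss + (1.0 - alpha) * kd_loss
```

Lowering `alpha` below 1.0 would blend in the teacher's soft targets, the natural next step after this initial cross-entropy-only run.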
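The data-preparation change can be sketched the same way, assuming the standard `datasets` and `transformers` APIs; the clinc_oos configuration name (`plus`) and the `text` column are assumptions, since the card does not pin them down.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# clinc_oos ships train / validation / test splits; the card uses
# train and validation. The "plus" configuration is an assumption.
clinc = load_dataset("clinc_oos", "plus")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Each CLINC150 example is a single utterance in the "text" column.
    return tokenizer(batch["text"], truncation=True)

clinc_encoded = clinc.map(tokenize, batched=True)
```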