lewtun (HF staff) committed
Commit d988483
Parent: 6b90b97

Add model card

Files changed (1): README.md (+45, -0)
README.md ADDED
---
language:
- en
thumbnail: https://github.com/karanchahal/distiller/blob/master/distiller.jpg
tags:
- question-answering
license: apache-2.0
datasets:
- squad
metrics:
- squad
---

# DistilBERT with a second step of distillation

## Model description

This model replicates the "DistilBERT (D)" model from Table 2 of the [DistilBERT paper](https://arxiv.org/pdf/1910.01108.pdf). In this approach, a DistilBERT student is fine-tuned on SQuAD v1.1, while a fine-tuned BERT model acts as a teacher for a second step of task-specific distillation.

In this version, the following pre-trained models were used:

* Student: `distilbert-base-uncased`
* Teacher: `maroo93/squad1.1`

## Intended uses & limitations

#### How to use
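
One way to use this model is via the 🤗 Transformers `question-answering` pipeline, sketched below. The checkpoint identifier is a placeholder for this repository's model id, and the context and question are illustrative only.

```python
from transformers import pipeline

# Placeholder: substitute this repository's Hub model id
model_checkpoint = "path/to/this-model"

qa_pipeline = pipeline(
    "question-answering",
    model=model_checkpoint,
    tokenizer=model_checkpoint,
)

context = (
    "DistilBERT is a distilled version of BERT that is smaller, faster, "
    "cheaper and lighter, while retaining most of BERT's accuracy."
)
question = "What is DistilBERT?"

result = qa_pipeline(question=question, context=context)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```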

## Training data

This model was trained on the SQuAD v1.1 dataset, which can be obtained from the `datasets` library as follows:

```python
from datasets import load_dataset

squad = load_dataset('squad')
```

## Training procedure
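
The exact training setup is not documented here, but the model description above outlines the approach: a second, task-specific distillation step in which the student is trained on SQuAD v1.1 against both the gold answers and the teacher's predictions. As a rough sketch only (the loss formulation, temperature, and weighting below are assumptions, not values taken from this repository or the paper), the per-position objective could look like:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with a soft KL term against the teacher.

    student_logits / teacher_logits: (batch, seq_len) start or end logits.
    labels: gold start or end token positions, shape (batch,).
    temperature and alpha are assumed values, not taken from this card.
    """
    # Standard span-extraction cross-entropy on the gold positions
    hard_loss = F.cross_entropy(student_logits, labels)

    # Soft targets: KL divergence between temperature-scaled distributions,
    # rescaled by T^2 as is conventional in distillation
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    return alpha * hard_loss + (1.0 - alpha) * soft_loss
```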

## Eval results

Exact Match | F1
------------|------
78.05       | 86.09

The score was calculated using the `squad` metric from `datasets`.
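
For reference, a minimal sketch of how such a score can be computed with the `squad` metric (the prediction and reference entries below are illustrative, not outputs of this model):

```python
from datasets import load_metric

squad_metric = load_metric("squad")

# Illustrative example in the format the `squad` metric expects
predictions = [
    {"id": "56be4db0acb8001400a502ec", "prediction_text": "Denver Broncos"}
]
references = [
    {
        "id": "56be4db0acb8001400a502ec",
        "answers": {"text": ["Denver Broncos"], "answer_start": [177]},
    }
]

print(squad_metric.compute(predictions=predictions, references=references))
# {'exact_match': 100.0, 'f1': 100.0}
```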

### BibTeX entry and citation info

```bibtex
@misc{sanh2020distilbert,
      title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
      author={Victor Sanh and Lysandre Debut and Julien Chaumond and Thomas Wolf},
      year={2020},
      eprint={1910.01108},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```