Update README.md
Fix SQuAD reference
README.md
CHANGED
@@ -33,7 +33,7 @@ This is possible because the pruning method lead to structured matrices: to visu
In terms of accuracy, its **F1 is 83.22**, compared with 85.85 for , a **F1 drop of 2.63**.

## Fine-Pruning details

-This model was fine-tuned from the HuggingFace [model](https://huggingface.co/bert-large-uncased-whole-word-masking) uncased checkpoint on [
+This model was fine-tuned from the HuggingFace [model](https://huggingface.co/bert-large-uncased-whole-word-masking) uncased checkpoint on [SQuAD2.0](https://rajpurkar.github.io/SQuAD-explorer), and distilled from the model [madlag/bert-large-uncased-whole-word-masking-finetuned-squadv2](https://huggingface.co/madlag/bert-large-uncased-whole-word-masking-finetuned-squadv2).

This model is case-insensitive: it does not make a difference between english and English.

A side-effect of the block pruning is that some of the attention heads are completely removed: 155 heads were removed on a total of 384 (40.4%).
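The corrected line points the card at the SQuAD2.0 dataset and at the distillation teacher. For context, below is a minimal sketch of how such a SQuAD2.0 question-answering checkpoint is typically loaded with the `transformers` pipeline; the model id used is the teacher named in the diff, serving as a placeholder since the pruned model's own repo id does not appear in this hunk.

```python
# Minimal sketch: load a SQuAD2.0 extractive-QA checkpoint with the
# transformers pipeline. The model id below is the distillation teacher
# mentioned in the diff; substitute the pruned model's own repo id as needed.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="madlag/bert-large-uncased-whole-word-masking-finetuned-squadv2",
)

result = qa(
    question="Which dataset was the model fine-tuned on?",
    context="This model was fine-tuned on SQuAD2.0 and distilled from a "
            "BERT-large whole-word-masking teacher.",
)
print(result)  # dict with 'score', 'start', 'end', and 'answer' keys
```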