Update README.md
README.md CHANGED
@@ -20,7 +20,7 @@ datasets:
---

## Model Details: 80% 1x4 Block Sparse BERT-Base (uncased) Fine Tuned on SQuADv1.1

This model has been fine-tuned for the NLP task of question answering, trained on the SQuAD 1.1 dataset. It is the result of fine-tuning a Prune Once For All 80% 1x4 block sparse pre-trained BERT-Base model, combined with knowledge distillation.

> We present a new method for training sparse pre-trained Transformer language models by integrating weight pruning and model distillation. These sparse pre-trained models can be used to transfer learning for a wide range of tasks while maintaining their sparsity pattern. We show how the compressed sparse pre-trained models we trained transfer their knowledge to five different downstream natural language tasks with minimal accuracy loss.
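
A minimal usage sketch (not part of the original card): loading the checkpoint with Hugging Face Transformers, running extractive question answering, and roughly checking the sparsity of the encoder's weight matrices. The model id below is an assumption; substitute the id under which this checkpoint is actually hosted.

```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
import torch

# Assumed repository id for this checkpoint -- replace with the actual model id.
model_id = "Intel/bert-base-uncased-squadv1.1-sparse-80-1x4-block-pruneofa"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForQuestionAnswering.from_pretrained(model_id)

# Extractive question answering on SQuAD-style (question, context) pairs.
qa = pipeline("question-answering", model=model, tokenizer=tokenizer)
result = qa(
    question="What dataset was the model fine-tuned on?",
    context="The model was fine-tuned on the SQuAD 1.1 question answering dataset.",
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': 'SQuAD 1.1'}

# Rough sparsity check: fraction of exactly-zero entries in the encoder's
# 2D weight matrices, which should be close to the advertised 80%.
total, zeros = 0, 0
for name, param in model.named_parameters():
    if "encoder" in name and param.dim() == 2:
        total += param.numel()
        zeros += int((param == 0).sum())
print(f"Zero weights in encoder linear layers: {zeros / total:.1%}")
```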