ekurtic committed
Commit 794b132
1 Parent(s): fc083b8

Update notes on model prep
Files changed (1): README.md (+3, -1)
README.md CHANGED

@@ -9,6 +9,7 @@ datasets: squad
 # mobilebert-uncased-finetuned-squadv1
 
 This model is a finetuned version of the [mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased/tree/main) model on the SQuADv1 task.
+To make this TPU-trained model stable when used in PyTorch on GPUs, the original model has been additionally pretrained for one epoch on BookCorpus and English Wikipedia with disabled dropout before finetuning on the SQuADv1 task.
 
 It is produced as part of the work on the paper [The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models](https://arxiv.org/abs/2203.07259).
 
@@ -30,4 +31,5 @@ If you find the model useful, please consider citing our work.
   journal={arXiv preprint arXiv:2203.07259},
   year={2022}
 }
-```
+```
+
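
The "disabled dropout" preparation step added in this commit can be sketched as follows. This is a hypothetical illustration, not the authors' actual training script: it only shows how dropout can be zeroed across a PyTorch module before the extra pretraining epoch, and the tiny `nn.Sequential` network stands in for the real MobileBERT model.

```python
import torch.nn as nn

def disable_dropout(model: nn.Module) -> nn.Module:
    # Set every dropout layer's probability to zero, so the
    # continued-pretraining pass runs without dropout noise.
    for module in model.modules():
        if isinstance(module, nn.Dropout):
            module.p = 0.0
    return model

# Tiny stand-in network (hypothetical; the real model is MobileBERT).
net = nn.Sequential(nn.Linear(8, 8), nn.Dropout(0.1), nn.ReLU())
disable_dropout(net)
```

After this call, every `nn.Dropout` in the module tree is a no-op in training mode, which is one simple way to realize the "disabled dropout" setting described in the README note.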