tau
/

splinter-large-qass

Question Answering

Inference Endpoints

Model card Files Files and versions Community

oriram commited on Sep 3, 2021

Commit

317a7d0

•

1 Parent(s): 3c30874

Update README.md

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -11,7 +11,7 @@ license: apache-2.0
 Splinter-large is the pretrained model discussed in the paper [Few-Shot Question Answering by Pretraining Span Selection](https://aclanthology.org/2021.acl-long.239/) (at ACL 2021). Its original repository can be found [here](https://github.com/oriram/splinter). The model is case-sensitive.
-Note (1): This model **does** contain the pretrained weights for the QASS layer (see paper for details), and therefore the QASS layer is randomly initialized upon loading it. For the model **without** those weights, see [tau/splinter-large](https://huggingface.co/tau/splinter-large).
 Note (2): Splinter-large was trained after the paper was released, so the results are not reported. However, this model outperforms the base model by large margins. For example, on SQuAD, the model is able to reach 80% F1 given only 128 examples, whereas the base model obtains only ~73%). See the results for Splinter-large in the Appendix of [this paper](https://arxiv.org/pdf/2108.05857.pdf).

 Splinter-large is the pretrained model discussed in the paper [Few-Shot Question Answering by Pretraining Span Selection](https://aclanthology.org/2021.acl-long.239/) (at ACL 2021). Its original repository can be found [here](https://github.com/oriram/splinter). The model is case-sensitive.
+Note (1): This model **does** contain the pretrained weights for the QASS layer (see paper for details). For the model **without** those weights, see [tau/splinter-large](https://huggingface.co/tau/splinter-large).
 Note (2): Splinter-large was trained after the paper was released, so the results are not reported. However, this model outperforms the base model by large margins. For example, on SQuAD, the model is able to reach 80% F1 given only 128 examples, whereas the base model obtains only ~73%). See the results for Splinter-large in the Appendix of [this paper](https://arxiv.org/pdf/2108.05857.pdf).