mrm8488/longformer-base-4096-spanish-finetuned-squad


mrm8488 committed on Jan 11, 2022

Commit ced8a92 • 1 Parent(s): 3b27214

Create README.md

Files changed (1)

README.md +20 -0

README.md ADDED

	@@ -0,0 +1,20 @@

+---
+language: es
+tags:
+- QA
+- Q&A
+datasets:
+- BSC-TeMU/SQAC
+---
+# Spanish Longformer fine-tuned on **SQAC** for Spanish **QA** 📖❓
+[longformer-base-4096-spanish](https://huggingface.co/mrm8488/longformer-base-4096-spanish) fine-tuned on [SQAC](https://huggingface.co/datasets/BSC-TeMU/SQAC) for the **Q&A** downstream task.
+## Details of the dataset 📚
+This dataset contains 6,247 contexts and 18,817 questions with their answers, with 1 to 5 questions for each context.
+The sources of the contexts are:
+* Encyclopedic articles from [Wikipedia in Spanish](https://es.wikipedia.org/), used under [CC-by-sa licence](https://creativecommons.org/licenses/by-sa/3.0/legalcode).
+* News from [Wikinews in Spanish](https://es.wikinews.org/), used under [CC-by licence](https://creativecommons.org/licenses/by/2.5/).
+* Text from the Spanish corpus [AnCora](http://clic.ub.edu/corpus/en), which is a mix of different newswire and literature sources, used under [CC-by licence](https://creativecommons.org/licenses/by/4.0/legalcode).
+This dataset can be used to build extractive QA systems.
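
"Extractive" here means that each answer is a literal span of its context. The sketch below illustrates that property on a single record. It assumes a SQuAD-style layout in which an answer carries its `text` and an `answer_start` character offset; those field names, the `answer_span` helper, and the sample record are assumptions made for illustration and are not taken from this commit or from SQAC.

```python
from typing import Tuple, TypedDict


class Answer(TypedDict):
    # Assumed SQuAD-style fields; check the dataset card for the real schema.
    text: str
    answer_start: int  # character offset of the answer inside the context


def answer_span(context: str, answer: Answer) -> Tuple[int, int]:
    """Return (start, end) character offsets of the answer, end exclusive.

    Raises ValueError when the annotated offset does not point at the answer text.
    """
    start = answer["answer_start"]
    end = start + len(answer["text"])
    if start < 0 or context[start:end] != answer["text"]:
        raise ValueError(f"answer {answer['text']!r} not found at offset {start}")
    return start, end


# Hypothetical record written for this sketch (not a SQAC example).
context = "El Longformer procesa secuencias de hasta 4096 tokens."
answer: Answer = {"text": "4096 tokens", "answer_start": context.index("4096")}

start, end = answer_span(context, answer)
print(context[start:end])
```

A check like this is useful before fine-tuning, because a record whose offset does not match its answer text cannot be turned into a valid start/end training target.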