Commit ced8a92 by mrm8488 (1 parent: 3b27214)

Create README.md

  1. README.md +20 -0
README.md ADDED
@@ -0,0 +1,20 @@
+ ---
+ language: es
+ tags:
+ - QA
+ - Q&A
+ datasets:
+ - BSC-TeMU/SQAC
+ ---
+ # Spanish Longformer fine-tuned on **SQAC** for Spanish **QA** 📖❓
+ [longformer-base-4096-spanish](https://huggingface.co/mrm8488/longformer-base-4096-spanish) fine-tuned on [SQAC](https://huggingface.co/datasets/BSC-TeMU/SQAC) for the **Q&A** downstream task.
+
+
+ ## Details of the dataset 📚
+
+ This dataset contains 6,247 contexts and 18,817 questions with their answers, with 1 to 5 questions for each context.
+ The sources of the contexts are:
+ * Encyclopedic articles from [Wikipedia in Spanish](https://es.wikipedia.org/), used under [CC-by-sa licence](https://creativecommons.org/licenses/by-sa/3.0/legalcode).
+ * News from [Wikinews in Spanish](https://es.wikinews.org/), used under [CC-by licence](https://creativecommons.org/licenses/by/2.5/).
+ * Text from the Spanish corpus [AnCora](http://clic.ub.edu/corpus/en), which is a mix of different newswire and literature sources, used under [CC-by licence](https://creativecommons.org/licenses/by/4.0/legalcode).
+ This dataset can be used to build extractive QA systems.
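
To illustrate what "extractive" means here, the sketch below checks that an answer is a literal span of its context. It assumes a SQuAD-style record layout (a context, a question, an answer text and a character offset); the `QAExample` class, its field names and the sample record are hypothetical and are not taken from SQAC or from this commit.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class QAExample:
    """One SQuAD-style record (field names are assumed for illustration)."""
    context: str
    question: str
    answer_text: str
    answer_start: int  # character offset of the answer inside `context`


def answer_span(example: QAExample) -> tuple[int, int]:
    """Return the (start, end) character offsets of the answer, end exclusive."""
    start = example.answer_start
    end = start + len(example.answer_text)
    return start, end


def is_extractive(example: QAExample) -> bool:
    """True if the answer is a literal substring of the context at the stated offset."""
    start, end = answer_span(example)
    return example.context[start:end] == example.answer_text


# Hypothetical record written for this sketch; it is not a SQAC entry.
example = QAExample(
    context="Madrid es la capital de España.",
    question="¿Cuál es la capital de España?",
    answer_text="Madrid",
    answer_start=0,
)
```

A check like `is_extractive(example)` can be run over a dataset before fine-tuning to catch records whose offsets do not line up with the answer text, since an extractive QA model is trained to predict exactly these start and end positions.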