Update README.md
README.md
CHANGED
@@ -21,7 +21,7 @@ influenced by its behavior in a monolingual context (English or French).
 
 ## Dataset
 The training dataset comprises the [mMARCO dataset](https://huggingface.co/datasets/unicamp-dl/mmarco), consisting of query/positive/hard negative triplets. Additionally,
-we have included [SQuAD](https://huggingface.co/datasets/rajpurkar/squad) data from the train split, forming query/positive/hard negative triplets. To generate hard
+we have included [SQuAD](https://huggingface.co/datasets/rajpurkar/squad) data from the "train" split, forming query/positive/hard negative triplets. To generate hard
 negative data for SQuAD, we considered contexts from the same theme as the query but from a different set of queries. Hence, the negative observations address the same
 themes as the queries but presumably do not contain the answer to the question.
 
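
The hard-negative construction described in the README text can be illustrated with a minimal sketch. It assumes that "theme" corresponds to the SQuAD `title` field and that a negative is any other context sharing that title; the README does not confirm either detail, and the function name `build_triplets` and the toy records are hypothetical.

```python
import random
from collections import defaultdict


def build_triplets(records, seed=0):
    """Build query/positive/hard-negative triplets from SQuAD-style records.

    Each record is a dict with 'title', 'context' and 'question' keys.
    Assumption: the 'title' field stands in for the "theme" of a query.
    """
    rng = random.Random(seed)

    # Group the distinct contexts by theme, preserving first-seen order.
    contexts_by_title = defaultdict(list)
    for rec in records:
        if rec["context"] not in contexts_by_title[rec["title"]]:
            contexts_by_title[rec["title"]].append(rec["context"])

    triplets = []
    for rec in records:
        # Candidate negatives: same theme, but not the positive context,
        # so they belong to a different set of questions.
        candidates = [
            c for c in contexts_by_title[rec["title"]] if c != rec["context"]
        ]
        if not candidates:
            continue  # no other context under this theme, skip the query
        triplets.append(
            {
                "query": rec["question"],
                "positive": rec["context"],
                "negative": rng.choice(candidates),
            }
        )
    return triplets


if __name__ == "__main__":
    # Toy records, invented for illustration only.
    toy = [
        {"title": "Paris", "context": "ctx A", "question": "q1"},
        {"title": "Paris", "context": "ctx B", "question": "q2"},
        {"title": "Rome", "context": "ctx C", "question": "q3"},
    ]
    for triplet in build_triplets(toy):
        print(triplet)
```

Negatives chosen this way share the topic of the query, which is what makes them hard, but nothing guarantees that they lack the answer, which matches the README's "presumably".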