Update README.md
README.md
CHANGED
@@ -209,6 +209,13 @@ print(model.compute_score(sentence_pairs,
 - Long Document Retrieval
 - MLDR:
 ![avatar](./imgs/long.jpg)
+Please note that MLDR is a document retrieval dataset that we constructed using an LLM,
+covering 13 languages and including training, validation, and test sets.
+We used the MLDR training set to improve the model's long-document retrieval capabilities.
+Therefore, comparing the baselines with `Dense w.o.long` (fine-tuned without the long-document dataset) is fairer.
+In addition, we will open-source this dataset to address the current lack of open-source multilingual long-text retrieval datasets.
+We believe this data will help the open-source community train document retrieval models.
+
 - NarrativeQA:
 ![avatar](./imgs/nqa.jpg)
 