Shitao committed on
Commit
4277867
1 Parent(s): 1f5d3ac

Update README.md

  1. README.md +7 -0
README.md CHANGED
@@ -209,6 +209,13 @@ print(model.compute_score(sentence_pairs,
  - Long Document Retrieval
  - MLDR:
  ![avatar](./imgs/long.jpg)
+ Please note that MLDR is a document retrieval dataset we constructed using an LLM,
+ covering 13 languages and including test, validation, and training sets.
+ We used the MLDR training set to improve the model's long-document retrieval capabilities.
+ Therefore, comparing baselines with `Dense w.o.long` (fine-tuned without the long-document dataset) is fairer.
+ Additionally, this long-document retrieval dataset will be open-sourced to address the current lack of open-source multilingual long-text retrieval datasets.
+ We believe that this data will help the open-source community train document retrieval models.
+
  - NarrativeQA:
  ![avatar](./imgs/nqa.jpg)