Cyrile committed on
Commit
a13cc2a
1 Parent(s): 644c00f

Update README.md

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -39,6 +39,9 @@ Model (EN/FR)
  | [Bloomz-560m-retriever](https://huggingface.co/cmarkea/bloomz-560m-retriever) | 10 | 44 | 49 | 77 | 86 |
  | [Bloomz-3b-retriever](https://huggingface.co/cmarkea/bloomz-3b-retriever) | 9 | 38 | 50 | 78 | 87 |
 
+ TF-IDF loses robustness in cross-language scenarios, even performing worse than CamemBERT, a model specialized in French. This is expected: a bag-of-words method has no way to relate the two languages, because a sentence and its translation share few or no tokens and therefore produce very different vectors.
+
+ CamemBERT performs poorly not because it fails to group contexts and queries by theme, but because a meta-cluster appears that separates contexts from queries (as illustrated in the image below), which makes this type of model unsuitable as a retriever.
 
  How to Use Bloomz-560m-retriever
  --------------------------------
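
Note on the paragraph added in this commit: the cross-language weakness of TF-IDF can be illustrated with a minimal sketch. It is not the setup used for the benchmark table. The three sentences, the tokenizer, and the smoothed IDF formula are all assumptions chosen for illustration, and the helper names (`tokenize`, `tfidf_vectors`, `cosine`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of letters (accented letters included).
    return re.findall(r"[^\W\d_]+", text.lower())


def tfidf_vectors(docs):
    # Sparse TF-IDF vectors as dicts, with an assumed smoothed IDF:
    # idf(t) = ln((1 + N) / (1 + df(t))) + 1
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1 for term, count in df.items()}
    return [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy data (illustrative only): one English context, the same question in EN and FR.
context_en = "The cat sleeps on the sofa in the living room."
query_en = "Where does the cat sleep?"
query_fr = "Où dort le chat ?"

vec_context, vec_query_en, vec_query_fr = tfidf_vectors([context_en, query_en, query_fr])

same_language = cosine(vec_query_en, vec_context)
cross_language = cosine(vec_query_fr, vec_context)

print(f"EN query vs EN context: {same_language:.3f}")
print(f"FR query vs EN context: {cross_language:.3f}")
```

In this toy case the French query shares no token with the English context, so its similarity is exactly zero, while the English query still overlaps on "the" and "cat". A dense retriever trained on both languages does not depend on token overlap in this way.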