sentence-transformers/distiluse-base-multilingual-cased-v2

Apr 29, 2023

I have the following scenario where I check similarity between my source and targets.

Even though the matching sentences are semantically close, they are not compatible in meaning, e.g. Jane vs. John. How can I avoid this kind of deviation? Maybe by using a different model designed for this purpose?

Thanks!

TiSy

Jul 10, 2023

Hey, did you find any solution to this problem?

tyk37

Jul 10, 2023

No, unfortunately.

Akseluhr

Apr 18, 2024

In this case I'd try different models and perhaps different methods (e.g. a w2v approach; although it is an older method, it might perform better than SOTA LLMs for certain problems). You can also try combining models. If you combine models, you can do majority voting at the end (as in a classic Random Forest), or average the results, depending on the prediction problem. Or, as you said, maybe there are models out there that account for this (if you find one, let me know).
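To make the combining idea concrete, here is a minimal sketch of both options, averaging and majority voting. It assumes you have already computed one similarity score per model for each candidate target. The candidate names, the scores, and the 0.7 threshold are made-up placeholders, not measured results.

```python
from statistics import mean

# Hypothetical similarity scores for one source sentence against two
# candidate targets, one score per model. Placeholder values only.
scores_by_candidate = {
    "candidate_a": [0.91, 0.40, 0.35],
    "candidate_b": [0.88, 0.80, 0.75],
}


def average_score(scores):
    """Combine the per-model scores by simple averaging."""
    return mean(scores)


def majority_match(scores, threshold=0.7):
    """Return True if more than half of the models score at or above threshold.

    The threshold is an assumption and would need tuning on real data.
    """
    votes = sum(score >= threshold for score in scores)
    return votes > len(scores) / 2


def best_by_average(candidates):
    """Pick the candidate whose averaged score is highest."""
    return max(candidates, key=lambda name: average_score(candidates[name]))


best = best_by_average(scores_by_candidate)
```

Note that in this toy data a single model would prefer `candidate_a` (0.91 vs 0.88), while the averaged and voted results prefer `candidate_b`, which is the kind of correction an ensemble is meant to give.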

As a final experiment, I'd weight up words that are closer to the source. It is perhaps not the most sophisticated solution, but it might do the trick.
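One possible reading of this word-weighting idea, as a rough sketch: blend the embedding similarity with a plain token-overlap score, so that a target sharing the exact name with the source is ranked higher. The function names, the `lexical_weight` parameter, the example sentences, and the 0.9 embedding score are all hypothetical. In practice the embedding score would come from your model.

```python
import re


def tokens(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def token_overlap(source, target):
    """Jaccard overlap between the word sets of source and target."""
    a, b = tokens(source), tokens(target)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def reweighted_score(embedding_score, source, target, lexical_weight=0.5):
    """Blend an embedding similarity with the lexical overlap.

    lexical_weight is an assumed knob: 0 keeps the embedding score
    unchanged, 1 uses only the token overlap.
    """
    overlap = token_overlap(source, target)
    return (1 - lexical_weight) * embedding_score + lexical_weight * overlap


# Hypothetical example: both targets get the same placeholder embedding
# score, so only the lexical term separates them.
source = "Jane went home"
score_john = reweighted_score(0.9, source, "John went home")
score_jane = reweighted_score(0.9, source, "Jane went home early")
```

With equal embedding scores, the target that keeps "Jane" ends up with the higher blended score. This only helps when the names appear literally in both texts, so it is a heuristic rather than a real fix.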

How to avoid meaning deviation?