jinaai/jina-bert-implementation · Fix sorting heuristic

Oct 17, 2023

We saw that models instantiated via AutoModel performed poorly on MTEB. During evaluation, most embeddings produced by such a model matched those of a working model, with a few exceptions within the batches. This appears to be caused by mixing sorted and np.argsort, which probably break ties differently when the input contains duplicates. As a consequence, sentences that have a unique length in their batch are embedded properly, but sentences with a non-unique length may be swapped with one another. I fixed this issue.
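To illustrate the idea, here is a minimal sketch of one way to avoid the mismatch: derive both the sorted order and its inverse from a single permutation, so that tie-breaking cannot differ between the two steps. The helper names sort_by_length and restore_order are hypothetical and are not taken from this repository, and kind="stable" is an assumption made for reproducibility rather than a description of the actual commit.

```python
import numpy as np


def sort_by_length(sentences):
    """Sort sentences by length and return the permutation that undoes the sort.

    Hypothetical helper for illustration; not the repository's actual code.
    """
    lengths = np.array([len(s) for s in sentences])
    # A single argsort defines the order. kind="stable" keeps equal-length
    # sentences in their original relative order (an assumption of this sketch).
    permutation = np.argsort(lengths, kind="stable")
    # The argsort of a permutation is its inverse.
    inverse_permutation = np.argsort(permutation)
    # Apply the same permutation instead of calling sorted() separately.
    sorted_sentences = [sentences[i] for i in permutation]
    return sorted_sentences, inverse_permutation


def restore_order(items, inverse_permutation):
    """Map per-sentence results back to the original input order."""
    return [items[i] for i in inverse_permutation]


# Several sentences share a length, which is the case that triggered the bug.
sentences = ["bb", "a", "cc", "dd", "e"]
sorted_sentences, inverse = sort_by_length(sentences)
restored = restore_order(sorted_sentences, inverse)
```

Because the sorted list and the inverse mapping come from the same permutation, restoring the order is consistent even when several sentences have the same length.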

Fix sorting heuristic 488c8182

Markus28

Oct 17, 2023

Closing in favor of the GitHub PR

Markus28 changed pull request status to closed Oct 17, 2023