Fix sorting heuristic

#3
by Markus28 - opened

We saw issues where models instantiated via AutoModel performed poorly on MTEB. During evaluation, we saw that most embeddings produced by this model matched those of a working model, with a few exceptions within the batches. This appears to be caused by mixing sorted and np.argsort, which probably break ties differently when the input contains duplicates. As a consequence, sentences that have a unique length in their batch are embedded properly, but ones with a non-unique length may be swapped. This PR fixes the issue.
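To illustrate the failure mode described above, here is a minimal sketch of one way to avoid it: derive both the sorted batch and the inverse mapping from a single permutation, so that no second sort can break ties differently. This is not the actual patch. The helper names `sort_by_length` and `restore_order` are hypothetical, and the upper-casing step is only a stand-in for the model's forward pass.

```python
import numpy as np


def sort_by_length(sentences):
    """Return the sentences sorted longest-first, plus the permutation used."""
    lengths = np.array([len(s) for s in sentences])
    # One stable argsort defines the order for the whole batch.
    # Ties keep their original relative position.
    order = np.argsort(-lengths, kind="stable")
    return [sentences[i] for i in order], order


def restore_order(items, order):
    """Undo the permutation returned by sort_by_length."""
    # `order` is a permutation, so it has no duplicates and no ties to break.
    inverse = np.argsort(order)
    return [items[i] for i in inverse]


# Three sentences share length 2, which is the case that exposed the bug.
sentences = ["bb", "a", "cc", "dddd", "ee"]
sorted_sentences, order = sort_by_length(sentences)

# Hypothetical stand-in for the model: "embed" each sentence as upper case.
fake_embeddings = [s.upper() for s in sorted_sentences]
restored = restore_order(fake_embeddings, order)
```

Because the reordering and its inverse come from the same `order` array, each output lands back at the index of its own input even when several sentences have the same length.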

Closing in favor of the GitHub PR.

Markus28 changed pull request status to closed
