Reranker

#30
by Totole - opened

Hi, thanks a lot for your work!

Two questions:

  • Does model.compute_score(sentence_pairs, max_passage_length, weights_for_different_modes) just compute a similarity score (e.g. cosine) from the embeddings (dense, sparse, colbert) produced by the model? In other words, is it cross-encoding or bi-encoding?
  • Why does the max_length_token of this model seem to be 514 and not 8000?
Beijing Academy of Artificial Intelligence org

Thanks for your interest in our work!

In addition, we have released some new rerankers (cross-encoders): https://huggingface.co/BAAI/bge-reranker-v2-m3#model-list . Feel free to use them and share your feedback.
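To illustrate the distinction raised in the first question, here is a minimal sketch of bi-encoder style scoring: query and passage are represented separately, a similarity is computed per mode (dense, sparse, colbert), and the per-mode scores are merged with weights. The helper names (`cosine`, `sparse_overlap`, `late_interaction`, `combine_scores`), the toy vectors, and the choice to divide by the sum of the weights are all assumptions made for illustration; this is not the FlagEmbedding implementation. A cross-encoder, by contrast, feeds the query and passage through the model together and outputs a single relevance score, so there are no per-text embeddings to compare.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sparse_overlap(query_weights, passage_weights):
    """Lexical score: sum of weight products over tokens present in both texts."""
    return sum(
        w * passage_weights[token]
        for token, w in query_weights.items()
        if token in passage_weights
    )


def late_interaction(query_vecs, passage_vecs):
    """ColBERT-style MaxSim: best passage match per query token vector, averaged."""
    best_matches = [max(cosine(q, p) for p in passage_vecs) for q in query_vecs]
    return sum(best_matches) / len(best_matches)


def combine_scores(dense, sparse, colbert, weights=(0.4, 0.2, 0.4)):
    """Weighted average of the three per-mode scores (normalisation is an assumption)."""
    total = sum(weights)
    return (weights[0] * dense + weights[1] * sparse + weights[2] * colbert) / total


# Toy, hand-made representations; a real model would produce these.
dense_score = cosine([1.0, 0.0], [1.0, 0.0])
sparse_score = sparse_overlap({"a": 0.5, "b": 0.2}, {"a": 0.4, "c": 0.9})
colbert_score = late_interaction([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]])
final_score = combine_scores(dense_score, sparse_score, colbert_score)
print(dense_score, sparse_score, colbert_score, final_score)
```

If compute_score merges per-mode similarities in the way its weights_for_different_modes parameter suggests, it sits on the bi-encoding side of this distinction, while the rerankers linked above are cross-encoders.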

  • Thanks!
  • I get an error when computing scores for queries longer than 514 tokens with the model.compute_score function (not with model.encode)

image.png
Here is my code
image.png
And the call

image.png

  • I get better results (in French) with the Embedder than with the Reranker :) I'll keep you posted

Hello, I need more detailed information about the error.

  1. Can you run the code here successfully?
  2. Maybe you can paste your full code here, and then I will test it to see if this error can be reproduced.
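When putting together a reproduction, it may help to first separate the pairs whose query exceeds the length at which the error was reported. The sketch below is only a stand-in: `split_by_query_length` is a hypothetical helper, it counts whitespace-separated words rather than real model tokens, and the default limit of 514 is taken from the report in this thread, not from the library. To count actual tokens, pass the model's own tokenizer in through `count_tokens`.

```python
def split_by_query_length(pairs, limit=514, count_tokens=lambda text: len(text.split())):
    """Split (query, passage) pairs into those within the limit and those over it.

    count_tokens is a placeholder that counts whitespace-separated words;
    swap in a real tokenizer-based counter for accurate numbers.
    """
    within, over = [], []
    for query, passage in pairs:
        target = over if count_tokens(query) > limit else within
        target.append((query, passage))
    return within, over


# Toy data: one short query and one artificially long query.
pairs = [
    ("short query", "some passage"),
    (" ".join(["word"] * 600), "another passage"),
]
within, over = split_by_query_length(pairs)
print(len(within), "pair(s) within the limit;", len(over), "pair(s) over it")
```

Scoring the two groups separately would show whether the failure depends on query length alone or also on the environment.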

For some strange reason, it works on Colab but not on Azure ML...
