Question about model types

#1
by cyt79 - opened

Hi, thanks for sharing all of these great models! Could you say a bit more about which models can be used as bi-encoders and which as cross-encoders? For instance, does it make sense to use this model to initialise the CrossEncoder class from Sentence-Transformers, as shown below?

from sentence_transformers import CrossEncoder
model = CrossEncoder('Muennighoff/SGPT-2.7B-weightedmean-nli-bitfit')

Hey! That does not make sense; the uploaded SGPT models are all Bi-Encoders.
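To make the Bi-Encoder idea concrete, here is a minimal sketch in plain Python. It assumes that "weightedmean" in the model name refers to position-weighted mean pooling, where later tokens get linearly larger weights. The toy token embeddings and the helper names `weighted_mean_pool` and `cosine` are made up for illustration; they are not taken from the SGPT code base.

```python
import math

def weighted_mean_pool(token_embeddings):
    """Pool per-token vectors into one vector.

    Assumption: token i (1-based) gets weight i / (1 + 2 + ... + n),
    so later tokens count more than earlier ones.
    """
    n = len(token_embeddings)
    total = n * (n + 1) / 2
    dim = len(token_embeddings[0])
    pooled = [0.0] * dim
    for i, vec in enumerate(token_embeddings, start=1):
        weight = i / total
        for d in range(dim):
            pooled[d] += weight * vec[d]
    return pooled

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for the model's per-token hidden states (hypothetical values).
query_tokens = [[0.9, 0.1], [0.8, 0.3]]
doc_tokens = [[0.7, 0.2], [0.9, 0.2], [0.8, 0.1]]

# A Bi-Encoder embeds each text on its own and compares the vectors afterwards.
query_vec = weighted_mean_pool(query_tokens)
doc_vec = weighted_mean_pool(doc_tokens)
print(cosine(query_vec, doc_vec))
```

The point of the sketch is that query and document never see each other inside the model, so document embeddings can be computed once and cached.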
I haven't experimented with sentence_transformers.CrossEncoder. The SGPT methodology for Cross-Encoders is to use the log probabilities of raw pre-trained GPT models, e.g. https://huggingface.co/EleutherAI/gpt-j-6b. You can check the example scripts here for that: https://github.com/Muennighoff/sgpt#cross-encoder
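As a rough illustration of the log-probability idea, here is a sketch in plain Python. It assumes the document and query are concatenated into one prompt, that a causal language model has already produced next-token logits for every position, and that the relevance score is the summed log probability of the query tokens. The function names and the toy logits are hypothetical; the linked example scripts are the reference for the real setup.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one position's vocabulary logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def query_log_prob(logits_per_position, token_ids, query_start):
    """Sum log P(token_t | tokens before t) over the query tokens.

    logits_per_position[t] holds the logits that predict token t + 1,
    as in a causal LM. query_start is the index of the first query token
    and must be at least 1, so every query token has a preceding position.
    """
    score = 0.0
    for t in range(query_start, len(token_ids)):
        log_probs = log_softmax(logits_per_position[t - 1])
        score += log_probs[token_ids[t]]
    return score

# Toy example with a 3-token vocabulary (all values made up).
# token_ids = document tokens followed by query tokens.
token_ids = [0, 2, 1, 1]
query_start = 2  # tokens at index 2 and 3 belong to the query
logits = [
    [0.1, 0.2, 0.3],
    [0.0, 2.0, 0.0],
    [0.0, 1.5, 0.5],
    [0.3, 0.3, 0.3],
]
print(query_log_prob(logits, token_ids, query_start))
```

A higher (less negative) score means the model finds the query more likely given the document. Unlike the Bi-Encoder case, every query-document pair needs its own forward pass.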

Ah, got it. I wasn't sure which models are Bi-Encoders and which are Cross-Encoders. Thanks for the clarification!
