Example in README could use an explanation

by tordbb - opened Aug 14, 2023

Aug 14, 2023

Hi there,
In the example used, it seems you encode the same list of sentences twice, then you compare those (identical) encodings. Is this what the example intended to show?

embeddings_1 = model.encode(sentences)
embeddings_2 = model.encode(sentences)
similarity = embeddings_1 @ embeddings_2.T
print(similarity)

I had hoped to see an example of how to compare the two sentences in that list, which might have looked something like this:

embeddings_1 = model.encode(sentences[0])
embeddings_2 = model.encode(sentences[1])
similarity = embeddings_1 @ embeddings_2.T
print(similarity)
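
For anyone comparing the two snippets above: entry [0, 1] of the matrix from the first snippet should be the same number that the second snippet computes. Here is a minimal sketch using hand-written unit vectors as a stand-in for the output of model.encode. The values are made up, and the sketch assumes the embeddings are L2-normalized so that the dot product equals cosine similarity.

```python
import numpy as np

# Hypothetical stand-ins for model.encode(sentences): two unit-length rows.
embeddings = np.array([[1.0, 0.0],
                       [0.6, 0.8]])

# README-style: every sentence against every sentence (a 2 x 2 matrix).
full = embeddings @ embeddings.T

# Pairwise: sentence 0 against sentence 1 only (a single number).
# .T is a no-op on a 1-D vector, so this is a plain dot product.
pair = embeddings[0] @ embeddings[1].T

print(full)
print(pair)
```

Under that assumption, the diagonal of the full matrix is each sentence compared with itself (close to 1), and the off-diagonal entries hold the pairwise scores.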

Shitao

Beijing Academy of Artificial Intelligence org Aug 14, 2023

Hi, thanks for your interest!
We wanted to show how to compute the similarities between two lists of sentences; you can replace sentences in embeddings_2 = model.encode(sentences) with any other list of sentences. This is helpful when you need to find the most similar sentence in a list of sentences.
Your method is correct for computing the similarity between sentence_1 and sentence_2.
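
To illustrate the "most similar sentence" use case, here is a rough sketch with made-up unit vectors in place of real model.encode output. The names queries, candidates, and best are hypothetical and do not come from the README.

```python
import numpy as np

# Hypothetical stand-ins for model.encode(...) output; each row is a unit vector.
queries = np.array([[1.0, 0.0],
                    [0.0, 1.0]])
candidates = np.array([[0.6, 0.8],
                       [0.8, 0.6],
                       [0.0, 1.0]])

# similarity[i, j] is the score of query i against candidate j (shape 2 x 3).
similarity = queries @ candidates.T

# Index of the best-scoring candidate for each query.
best = similarity.argmax(axis=1)
print(best)
```

With these toy vectors, the first query matches candidate 1 and the second matches candidate 2. With real embeddings the same argmax over each row would pick the closest sentence.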

tordbb

Aug 14, 2023

Thanks for explaining this; that makes sense!
What threw me off was that you create both embeddings_1 and embeddings_2 from the same variable, sentences.
If you want to make the usefulness of this model more obvious to new readers, you may want to change the example to the following:

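from FlagEmbedding import FlagModel
# Two different lists; the strings mean "sample data-1" ... "sample data-5".
sentences_1 = ["样例数据-1", "样例数据-2"]
sentences_2 = ["样例数据-3", "样例数据-4", "样例数据-5"]
# The instruction means roughly: "Generate a representation for this sentence for retrieving relevant articles:"
model = FlagModel('BAAI/bge-large-zh', query_instruction_for_retrieval="为这个句子生成表示以用于检索相关文章：")
embeddings_1 = model.encode(sentences_1)
embeddings_2 = model.encode(sentences_2)
# One row per sentence in sentences_1, one column per sentence in sentences_2.
similarity = embeddings_1 @ embeddings_2.T
print(similarity)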

PS: Thank you for your work on this model. I just used it to compare the caption generated from an image against the prompt from which the image was generated.
It seems to have done a good job!

Shitao

Beijing Academy of Artificial Intelligence org Aug 14, 2023

Thanks for your advice! I will update the readme.

wilfoderek

Dec 28, 2023

Any recommendations for reproducing the model for Spanish?
