sentence-transformers/all-MiniLM-L6-v2 · Using embeddings to do sentence similarity

May 18, 2023

Has anyone used the embeddings to calculate sentence similarity like the example card? If so, what are the steps you took to do this?

mintujohnson

May 22, 2023

edited May 22, 2023

This is actually a straightforward task, thanks to the Hugging Face sentence-transformers utilities.
We just need to encode the sentences and compare the embeddings with a similarity function.

Step 1: Encode the sentences to be compared

from sentence_transformers import SentenceTransformer

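# Load the pretrained model (downloaded on first use)
model = SentenceTransformer('all-MiniLM-L6-v2')
# Encode each list into a tensor of embeddings, one row per sentence
embeddings1 = model.encode(sentences1, convert_to_tensor=True)
embeddings2 = model.encode(sentences2, convert_to_tensor=True)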

(where sentences1 and sentences2 are lists of sentences, i.e. lists of strings)

Step 2: Compute the similarity matrix

(cosine similarity or dot product; entry [i][j] of the result compares sentences1[i] with sentences2[j])

from sentence_transformers import util
cosine_scores = util.cos_sim(embeddings1, embeddings2)
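To make it clearer what the score matrix contains, here is a minimal sketch in plain Python that builds the same kind of matrix by hand. The helper names `cosine` and `cos_sim_matrix` and the toy vectors are made up for illustration and are not part of sentence-transformers; `util.cos_sim` does the equivalent computation on tensors.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def cos_sim_matrix(a, b):
    """Entry [i][j] is the cosine similarity between a[i] and b[j]."""
    return [[cosine(u, v) for v in b] for u in a]

# Toy 2-d vectors standing in for real sentence embeddings (illustrative only).
toy1 = [[1.0, 0.0], [1.0, 1.0]]
toy2 = [[2.0, 0.0], [0.0, 3.0]]

scores = cos_sim_matrix(toy1, toy2)
for row in scores:
    print(["{:.4f}".format(s) for s in row])
```

Note that only the direction of the vectors matters here: `[1.0, 0.0]` and `[2.0, 0.0]` point the same way, so they score 1.0 despite their different lengths.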

Step 3: Output the pairs with their score

# Assumes both lists have the same length; pair i is scored by the diagonal entry [i][i]
for i in range(len(sentences1)):
    print("{} \t\t {} \t\t Score: {:.4f}".format(sentences1[i], sentences2[i], cosine_scores[i][i]))
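The loop in Step 3 only reads the diagonal of the matrix, so it scores sentences1[i] against sentences2[i]. If you would rather find, for each sentence in the first list, the closest sentence in the second list, one option is to take the largest entry in each row. This is a rough sketch with a hand-written score matrix: the sentences and numbers are made up (not real model output), and `best_matches` is a hypothetical helper, not a library function.

```python
def best_matches(scores):
    """For each row i, return (j, score) of the highest-scoring column."""
    result = []
    for row in scores:
        j = max(range(len(row)), key=lambda k: row[k])
        result.append((j, row[j]))
    return result

# Made-up sentences and scores, purely for illustration (not real model output).
list1 = ["A cat sits outside", "A man plays guitar"]
list2 = ["Someone is playing an instrument", "A kitten is in the garden"]
toy_scores = [
    [0.05, 0.70],
    [0.65, 0.10],
]

matches = best_matches(toy_scores)
for i, (j, score) in enumerate(matches):
    print("{} \t\t {} \t\t Score: {:.4f}".format(list1[i], list2[j], score))
```

With the real tensor from Step 2, you could pass `cosine_scores.tolist()` to get the nested list this sketch expects.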

For more details, see the Sentence-Transformers documentation:
https://www.sbert.net/docs/usage/semantic_textual_similarity.html

gerkim62

Aug 19, 2023

hi