Commit 0750bb8 (verified) by ssmits · 1 Parent(s): d8c012c

Update README.md

Files changed (1)
  1. README.md +34 -1
README.md CHANGED
@@ -20,4 +20,37 @@ KeyError: 'qwen2'
  ```
 
  ## Usage
- The 'lm_head' layer of this model has been removed, which means it can be used for embeddings. It will not perform greatly, as it needs to be further fine-tuned, as shown by [intfloat/e5-mistral-7b-instruct](https://huggingface.co/intfloat/e5-mistral-7b-instruct).
+ The 'lm_head' layer of this model has been removed, which means it can be used for embeddings. It will not perform greatly, as it needs to be further fine-tuned, as shown by [intfloat/e5-mistral-7b-instruct](https://huggingface.co/intfloat/e5-mistral-7b-instruct).
+
+ ## Inference
+ ```python
+ from sentence_transformers import SentenceTransformer
+ import torch
+
+ # 1. Load a pretrained Sentence Transformer model
+ model = SentenceTransformer("ssmits/Qwen2-7B-embed-base", device = "cpu")
+
+ # The sentences to encode
+ sentences = [
+ "The weather is lovely today.",
+ "It's so sunny outside!",
+ "He drove to the stadium.",
+ ]
+
+ # 2. Calculate embeddings by calling model.encode()
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 3584)
+
+ # 3. Calculate the embedding similarities
+ # Assuming embeddings is a numpy array, convert it to a torch tensor
+ embeddings_tensor = torch.tensor(embeddings)
+
+ # Using torch to compute cosine similarity matrix
+ similarities = torch.nn.functional.cosine_similarity(embeddings_tensor.unsqueeze(0), embeddings_tensor.unsqueeze(1), dim=2)
+
+ print(similarities)
+ # tensor([[1.0000, 0.8735, 0.7051],
+ # [0.8735, 1.0000, 0.7199],
+ # [0.7051, 0.7199, 1.0000]])
+ ```
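
Review note: step 3 of the added snippet gets the full similarity matrix by broadcasting two views of the embeddings inside `torch.nn.functional.cosine_similarity`. One way to read that step is "scale every vector to unit length, then take all pairwise dot products". The dependency-free sketch below illustrates that reading only. The helper name `cosine_similarity_matrix` and the toy 2-dimensional vectors are assumptions made for illustration and are not part of the commit or of any library.

```python
import math
from typing import Sequence


def cosine_similarity_matrix(vectors: Sequence[Sequence[float]]) -> list[list[float]]:
    """Return the pairwise cosine similarity matrix for a list of vectors."""
    # L2 norm of each vector; a zero vector has no direction, so reject it.
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    if any(n == 0.0 for n in norms):
        raise ValueError("cosine similarity is undefined for a zero vector")
    # Scale every vector to unit length so that a dot product equals the cosine.
    unit = [[x / n for x in v] for v, n in zip(vectors, norms)]
    # Entry [i][j] is the dot product of unit vector i with unit vector j.
    return [[sum(a * b for a, b in zip(u, w)) for w in unit] for u in unit]


# Hypothetical toy "embeddings", chosen so the result can be checked by hand:
# the first two are orthogonal, and the third sits at 45 degrees to both.
toy_embeddings = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
similarities = cosine_similarity_matrix(toy_embeddings)
```

As in the matrix printed by the README snippet, the result is symmetric and its diagonal is 1 up to floating-point rounding, since every vector is compared with itself there.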