jinaai/jina-embedding-b-en-v1

Sentence Similarity

sentence-transformers

feature-extraction

Inference Endpoints


bwang0911 committed on Jul 16, 2023

Commit 6d54182 · 1 Parent(s): 7ae7b7f

Update README.md

Files changed (1)

README.md +35 -0


@@ -1772,6 +1772,8 @@ We compared the model against `all-minilm-l6-v2`/`all-mpnet-base-v2` from sbert
 ## Usage
+Usage with Jina AI Finetuner:
 ```python
 !pip install finetuner
 import finetuner
@@ -1784,6 +1786,39 @@ embeddings = finetuner.encode(
 print(finetuner.cos_sim(embeddings[0], embeddings[1]))
 ```
+Use directly with Huggingface Transformers:
+```python
+import torch
+from transformers import AutoModel, AutoTokenizer
+def mean_pooling(model_output, attention_mask):
+    token_embeddings = model_output[0]
+    input_mask_expanded = (
+        attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+    )
+    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(
+        input_mask_expanded.sum(1), min=1e-9
+    )
+# Sentences we want sentence embeddings for
+sentences = ['how is the weather today', 'What is the current weather like today?']
+# Load model from HuggingFace Hub
+tokenizer = AutoTokenizer.from_pretrained('jinaai/jina-embedding-s-en-v1')
+model = AutoModel.from_pretrained('jinaai/jina-embedding-s-en-v1')
+with torch.inference_mode():
+    encoded_input = tokenizer(
+        sentences, padding=True, truncation=True, return_tensors='pt'
+    )
+    model_output = model.encoder(**encoded_input)
+    embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
+```
 ## Fine-tuning
 Please consider [Finetuner](https://github.com/jina-ai/finetuner).
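
The `mean_pooling` helper added in this commit averages token embeddings while skipping padded positions. As a minimal sketch of that arithmetic, here is the same masked mean in plain Python on toy numbers, with no model download. The names `masked_mean` and `cos_sim` are hypothetical, chosen for illustration only; they are not part of Finetuner or Transformers.

```python
import math


def masked_mean(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            for i, value in enumerate(vector):
                summed[i] += value
    # Mirrors the clamp(min=1e-9) guard in the diff: avoids division by zero
    count = max(float(sum(attention_mask)), 1e-9)
    return [s / count for s in summed]


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "sentence": two real token vectors followed by one padding vector
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = masked_mean(tokens, mask)
print(pooled)  # [2.0, 3.0] -- the padding vector does not contribute
```

With the tensors produced by the Transformers snippet in the diff, the two sentence embeddings could presumably be compared with `torch.nn.functional.cosine_similarity(embeddings[0], embeddings[1], dim=0)`, which plays the same role as `finetuner.cos_sim` in the Finetuner example.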