Update README.md
README.md
CHANGED
@@ -2652,6 +2652,44 @@ Jina Embeddings V2 [technical report](https://arxiv.org/abs/2310.19923)
|
|
2652 |
|
2653 |
## Usage
|
2654 |
2655 |
You can use Jina Embedding models directly from the `transformers` package:
|
2656 |
```python
|
2657 |
!pip install transformers
|
@@ -2679,8 +2717,9 @@ Alternatively, you can use Jina AI's [Embedding platform](https://jina.ai/embedd
|
|
2679 |
|
2680 |
## Use Jina Embeddings for RAG
|
2681 |
|
2682 |
-
|
2683 |
-
|
|
|
2684 |
|
2685 |
<img src="https://miro.medium.com/v2/resize:fit:4800/format:webp/1*ZP2RVejCZovF3FDCg-Bx3A.png" width="780px">
|
2686 |
|
@@ -2707,27 +2746,4 @@ If you find Jina Embeddings useful in your research, please cite the following p
|
|
2707 |
archivePrefix={arXiv},
|
2708 |
primaryClass={cs.CL}
|
2709 |
}
|
2710 |
-
```
|
2711 |
-
|
2712 |
-
<!--
|
2713 |
-
``` latex
|
2714 |
-
@misc{günther2023jina,
|
2715 |
-
title={Beyond the 512-Token Barrier: Training General-Purpose Text
|
2716 |
-
Embeddings for Large Documents},
|
2717 |
-
author={Michael Günther and Jackmin Ong and Isabelle Mohr and Alaeddine Abdessalem and Tanguy Abel and Mohammad Kalim Akram and Susana Guzman and Georgios Mastrapas and Saba Sturua and Bo Wang},
|
2718 |
-
year={2023},
|
2719 |
-
eprint={2307.11224},
|
2720 |
-
archivePrefix={arXiv},
|
2721 |
-
primaryClass={cs.CL}
|
2722 |
-
}
|
2723 |
-
|
2724 |
-
@misc{günther2023jina,
|
2725 |
-
title={Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models},
|
2726 |
-
author={Michael Günther and Louis Milliken and Jonathan Geuter and Georgios Mastrapas and Bo Wang and Han Xiao},
|
2727 |
-
year={2023},
|
2728 |
-
eprint={2307.11224},
|
2729 |
-
archivePrefix={arXiv},
|
2730 |
-
primaryClass={cs.CL}
|
2731 |
-
}
|
2732 |
-
```
|
2733 |
-
-->
|
|
|
2652 |
|
2653 |
## Usage
|
2654 |
|
2655 |
+
<details><summary>Please apply **mean pooling** when integrating the model.</summary>
|
2656 |
+
<p>
|
2657 |
+
|
2658 |
+
### Why mean pooling?
|
2659 |
+
|
2660 |
+
`mean pooling` takes all token embeddings from the model output and averages them at the sentence/paragraph level.
|
2661 |
+
It has proven to be one of the most effective ways to produce high-quality sentence embeddings.
|
2662 |
+
The model offers an `encode` function that handles this for you.
|
2663 |
+
|
2664 |
+
However, if you would like to do it without using the default `encode` function:
|
2665 |
+
|
2666 |
+
```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

def mean_pooling(model_output, attention_mask):
    # The first element of the model output holds the per-token embeddings
    token_embeddings = model_output[0]
    # Expand the mask to the embedding shape so padding tokens contribute zero
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

sentences = ['How is the weather today?', 'What is the current weather like today?']

tokenizer = AutoTokenizer.from_pretrained('jinaai/jina-embeddings-v2-small-en')
model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-small-en', trust_remote_code=True)

encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Inference only, so gradients are not needed
with torch.no_grad():
    model_output = model(**encoded_input)

# Pool token embeddings into one vector per sentence, then L2-normalize
embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
embeddings = F.normalize(embeddings, p=2, dim=1)
```
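
To see what the masking in `mean_pooling` does without downloading a model, here is a dependency-free toy sketch of the same arithmetic. The 2-dimensional token vectors, the mask, and the helper names `mean_pool` and `l2_normalize` are made up for illustration; they are not part of the model's API.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions whose mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input, mirroring the clamp in the torch version
    return [s / max(count, 1e-9) for s in summed]

def l2_normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector

# Hypothetical token embeddings; the last position is padding and must be ignored
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)   # average of the first two vectors only
unit = l2_normalize(pooled)        # unit-length vector, ready for dot-product comparison
print(pooled, unit)
```

Because the padded position is masked out, its large values do not affect the pooled vector. After L2 normalization, the dot product of two embeddings equals their cosine similarity.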
|
2689 |
+
|
2690 |
+
</p>
|
2691 |
+
</details>
|
2692 |
+
|
2693 |
You can use Jina Embedding models directly from the `transformers` package:
|
2694 |
```python
|
2695 |
!pip install transformers
|
|
|
2717 |
|
2718 |
## Use Jina Embeddings for RAG
|
2719 |
|
2720 |
+
According to the latest blog post from [LlamaIndex](https://blog.llamaindex.ai/boosting-rag-picking-the-best-embedding-reranker-models-42d079022e83):
|
2721 |
+
|
2722 |
+
> In summary, to achieve the peak performance in both hit rate and MRR, the combination of OpenAI or JinaAI-Base embeddings with the CohereRerank/bge-reranker-large reranker stands out.
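
For readers unfamiliar with the two metrics named in the quote, the following is a minimal sketch of how hit rate and MRR (mean reciprocal rank) are commonly computed. It assumes one relevant document per query; the document IDs and function names are hypothetical, and it does not reproduce the evaluation setup used in the blog post.

```python
def hit_rate(ranked_results, relevant_ids, k):
    """Fraction of queries whose relevant document appears in the top-k results."""
    hits = sum(
        1 for ranked, relevant in zip(ranked_results, relevant_ids)
        if relevant in ranked[:k]
    )
    return hits / len(relevant_ids)

def mean_reciprocal_rank(ranked_results, relevant_ids):
    """Average of 1/rank of the relevant document (0 when it is not retrieved)."""
    total = 0.0
    for ranked, relevant in zip(ranked_results, relevant_ids):
        if relevant in ranked:
            total += 1.0 / (ranked.index(relevant) + 1)
    return total / len(relevant_ids)

# Hypothetical retrieval results for three queries
ranked = [["d1", "d2", "d3"], ["d5", "d4", "d6"], ["d7", "d8", "d9"]]
relevant = ["d1", "d4", "d0"]   # "d0" is never retrieved

hr = hit_rate(ranked, relevant, k=2)           # 2 of 3 queries hit within the top 2
mrr = mean_reciprocal_rank(ranked, relevant)   # (1/1 + 1/2 + 0) / 3
print(hr, mrr)
```

Hit rate only checks whether the relevant document was retrieved at all, while MRR also rewards ranking it near the top.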
|
2723 |
|
2724 |
<img src="https://miro.medium.com/v2/resize:fit:4800/format:webp/1*ZP2RVejCZovF3FDCg-Bx3A.png" width="780px">
|
2725 |
|
|
|
2746 |
archivePrefix={arXiv},
|
2747 |
primaryClass={cs.CL}
|
2748 |
}
|
2749 |
+
```
|