Files changed (1)
  1. README.md +42 -26
README.md CHANGED
@@ -2652,6 +2652,44 @@ Jina Embeddings V2 [technical report](https://arxiv.org/abs/2310.19923)
2652
 
2653
  ## Usage
2654
 
2655
  You can use Jina Embeddings models directly from the `transformers` package:
2656
  ```python
2657
  !pip install transformers
@@ -2679,8 +2717,9 @@ Alternatively, you can use Jina AI's [Embedding platform](https://jina.ai/embedd
2679
 
2680
  ## Use Jina Embeddings for RAG
2681
 
2682
- Jina Embeddings are very effective for retrieval augmented generation (RAG).
2683
- Ravi Theja wrote a [blog post](https://blog.llamaindex.ai/boosting-rag-picking-the-best-embedding-reranker-models-42d079022e83) on using Jina Embeddings together with [LLama Index](https://github.com/run-llama/llama_index) for RAG:
 
2684
 
2685
  <img src="https://miro.medium.com/v2/resize:fit:4800/format:webp/1*ZP2RVejCZovF3FDCg-Bx3A.png" width="780px">
2686
 
@@ -2707,27 +2746,4 @@ If you find Jina Embeddings useful in your research, please cite the following p
2707
  archivePrefix={arXiv},
2708
  primaryClass={cs.CL}
2709
  }
2710
- ```
2711
-
2712
- <!--
2713
- ``` latex
2714
- @misc{günther2023jina,
2715
- title={Beyond the 512-Token Barrier: Training General-Purpose Text
2716
- Embeddings for Large Documents},
2717
- author={Michael Günther and Jackmin Ong and Isabelle Mohr and Alaeddine Abdessalem and Tanguy Abel and Mohammad Kalim Akram and Susana Guzman and Georgios Mastrapas and Saba Sturua and Bo Wang},
2718
- year={2023},
2719
- eprint={2307.11224},
2720
- archivePrefix={arXiv},
2721
- primaryClass={cs.CL}
2722
- }
2723
-
2724
- @misc{günther2023jina,
2725
- title={Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models},
2726
- author={Michael Günther and Louis Milliken and Jonathan Geuter and Georgios Mastrapas and Bo Wang and Han Xiao},
2727
- year={2023},
2728
- eprint={2307.11224},
2729
- archivePrefix={arXiv},
2730
- primaryClass={cs.CL}
2731
- }
2732
- ```
2733
- -->
 
2652
 
2653
  ## Usage
2654
 
2655
+ <details><summary>Please apply <strong>mean pooling</strong> when integrating the model.</summary>
2656
+ <p>
2657
+
2658
+ ### Why mean pooling?
2659
+
2660
+ Mean pooling takes all token embeddings from the model output and averages them at the sentence/paragraph level.
2661
+ It has proven to be one of the most effective ways to produce high-quality sentence embeddings.
2662
+ The model provides an `encode` function that applies mean pooling for you.
2663
+
2664
+ However, if you would like to do it without using the default `encode` function:
2665
+
2666
+ ```python
2667
+ import torch
2668
+ import torch.nn.functional as F
2669
+ from transformers import AutoTokenizer, AutoModel
2670
+
2671
+ def mean_pooling(model_output, attention_mask):
2672
+     token_embeddings = model_output[0]
2673
+     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
2674
+     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
2675
+
2676
+ sentences = ['How is the weather today?', 'What is the current weather like today?']
2677
+
2678
+ tokenizer = AutoTokenizer.from_pretrained('jinaai/jina-embeddings-v2-small-en')
2679
+ model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-small-en', trust_remote_code=True)
2680
+
2681
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
2682
+
2683
+ with torch.no_grad():
2684
+     model_output = model(**encoded_input)
2685
+
2686
+ embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
2687
+ embeddings = F.normalize(embeddings, p=2, dim=1)
2688
+ ```
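
To make the role of the attention mask concrete, here is a minimal sketch on hand-made tensors (no model download). The toy values and the simplified signature, which takes token embeddings directly instead of the full model output, are assumptions for illustration only; the arithmetic mirrors the `mean_pooling` function above.

```python
import torch

def mean_pooling(token_embeddings, attention_mask):
    # Expand the mask to the embedding dimension so padded positions contribute zero.
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    summed = torch.sum(token_embeddings * mask, dim=1)
    # Divide by the number of real (non-padding) tokens; clamp avoids division by zero.
    counts = torch.clamp(mask.sum(dim=1), min=1e-9)
    return summed / counts

# Hypothetical toy batch: 2 sequences, 3 token positions, 2-dimensional embeddings.
token_embeddings = torch.tensor([
    [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],  # last position is padding
    [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],      # no padding
])
attention_mask = torch.tensor([[1, 1, 0], [1, 1, 1]])

pooled = mean_pooling(token_embeddings, attention_mask)
print(pooled)  # first row averages only the two real tokens: [2.0, 3.0]
```

The padded position in the first sequence holds large values on purpose: because its mask entry is 0, it does not affect the average.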
2689
+
2690
+ </p>
2691
+ </details>
2692
+
2693
  You can use Jina Embeddings models directly from the `transformers` package:
2694
  ```python
2695
  !pip install transformers
 
2717
 
2718
  ## Use Jina Embeddings for RAG
2719
 
2720
+ According to a blog post from [LlamaIndex](https://blog.llamaindex.ai/boosting-rag-picking-the-best-embedding-reranker-models-42d079022e83):
2721
+
2722
+ > In summary, to achieve the peak performance in both hit rate and MRR, the combination of OpenAI or JinaAI-Base embeddings with the CohereRerank/bge-reranker-large reranker stands out.
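
For readers unfamiliar with the two metrics named in the quote, the following is a rough sketch of how hit rate and MRR (mean reciprocal rank) are commonly computed. It assumes exactly one relevant document per query; the function names, document IDs, and rankings are hypothetical and are not taken from the blog post's evaluation code.

```python
def hit_rate(ranked_results, relevant_ids, k):
    """Fraction of queries whose relevant document appears in the top-k results."""
    hits = sum(
        1 for ranked, relevant in zip(ranked_results, relevant_ids)
        if relevant in ranked[:k]
    )
    return hits / len(relevant_ids)

def mean_reciprocal_rank(ranked_results, relevant_ids):
    """Average of 1/rank of the relevant document; a miss contributes 0."""
    total = 0.0
    for ranked, relevant in zip(ranked_results, relevant_ids):
        if relevant in ranked:
            total += 1.0 / (ranked.index(relevant) + 1)
    return total / len(relevant_ids)

# Hypothetical retrieval output for three queries (best match first).
ranked_results = [
    ["d1", "d2", "d3"],  # relevant doc at rank 1
    ["d5", "d4", "d6"],  # relevant doc at rank 2
    ["d7", "d8", "d9"],  # relevant doc not retrieved
]
relevant_ids = ["d1", "d4", "d0"]

print(hit_rate(ranked_results, relevant_ids, k=2))
print(mean_reciprocal_rank(ranked_results, relevant_ids))
```

Hit rate only asks whether the relevant document was retrieved at all, while MRR also rewards placing it near the top, which is why a reranker can improve MRR even when hit rate stays the same.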
2723
 
2724
  <img src="https://miro.medium.com/v2/resize:fit:4800/format:webp/1*ZP2RVejCZovF3FDCg-Bx3A.png" width="780px">
2725
 
 
2746
  archivePrefix={arXiv},
2747
  primaryClass={cs.CL}
2748
  }
2749
+ ```