docs: refine README for model releasing

#2
by jemfu - opened
Files changed (1)
  1. README.md +61 -16
README.md CHANGED
@@ -35,7 +35,7 @@ The model is also equipped with a flash attention mechanism, which significantly
35
 
36
  # Usage
37
 
38
- 1. The easiest way to starting using `jina-reranker-v2-base-multilingual` is to use Jina AI's [Reranker API](https://jina.ai/reranker/).
39
 
40
  ```bash
41
  curl https://api.jina.ai/v1/rerank \
@@ -62,30 +62,38 @@ curl https://api.jina.ai/v1/rerank \
62
 
63
  2. You can also use the `transformers` library to interact with the model programmatically.
64
 
65
  ```python
66
- !pip install transformers
67
  from transformers import AutoModelForSequenceClassification
68
 
69
  model = AutoModelForSequenceClassification.from_pretrained(
70
  'jinaai/jina-reranker-v2-base-multilingual',
71
- device_map="cuda",
72
  torch_dtype="auto",
73
  trust_remote_code=True,
74
  )
75
 
76
  # Example query and documents
77
  query = "Organic skincare products for sensitive skin"
78
  documents = [
79
- "Eco-friendly kitchenware for modern homes",
80
- "Biodegradable cleaning supplies for eco-conscious consumers",
81
- "Organic cotton baby clothes for sensitive skin",
82
- "Natural organic skincare range for sensitive skin",
83
- "Tech gadgets for smart homes: 2024 edition",
84
- "Sustainable gardening tools and compost solutions",
85
- "Sensitive skin-friendly facial cleansers and toners",
86
- "Organic food wraps and storage solutions",
87
- "All-natural pet food for dogs with allergies",
88
- "Yoga mats made from recycled materials"
89
  ]
90
 
91
  # construct sentence pairs
@@ -94,9 +102,30 @@ sentence_pairs = [[query, doc] for doc in documents]
94
  scores = model.compute_score(sentence_pairs, max_length=1024)
95
  ```
96
 
97
- That's it! You can now use the `jina-reranker-v2-base-multilingual` model in your projects.
98
 
99
- Note that by default, the `jina-reranker-v2-base-multilingual` model uses [flash attention](https://github.com/Dao-AILab/flash-attention), which requires certain types of GPU hardware to run. If you encounter any issues, you can try call `AutoModelForSequenceClassification.from_pretrained()` with `use_flash_attn=False`. This will use the standard attention mechanism instead of flash attention. You can also try running the model on a CPU by setting `device_map="cpu"`.
100
 
101
 
102
  In addition to the `compute_score()` function, the `jina-reranker-v2-base-multilingual` model also provides a `model.rerank()` function that can be used to rerank documents based on a query. You can use it as follows:
@@ -113,5 +142,21 @@ result = model.rerank(
113
 
114
  Inside the `result` object, you will find the reranked documents along with their scores. You can use this information to further process the documents as needed.
115
 
116
- What's more, the `rerank()` function will automatically chunk the input documents into smaller pieces if they exceed the model's maximum input length. This allows you to rerank long documents without running into memory issues.
117
  Specifically, the `rerank()` function will split the documents into chunks of size `max_length` and rerank each chunk separately. The scores from all the chunks are then combined to produce the final reranking results. You can control the query length and document length in each chunk by setting the `max_query_length` and `max_length` parameters. The `rerank()` function also supports the `overlap` parameter (default is `80`) which determines how much overlap there is between adjacent chunks. This can be useful when reranking long documents to ensure that the model has enough context to make accurate predictions.
 
35
 
36
  # Usage
37
 
38
+ 1. The easiest way to use `jina-reranker-v2-base-multilingual` is to call Jina AI's [Reranker API](https://jina.ai/reranker/).
39
 
40
  ```bash
41
  curl https://api.jina.ai/v1/rerank \
 
62
 
63
  2. You can also use the `transformers` library to interact with the model programmatically.
64
 
65
+ Before you start, install the `transformers` and `einops` libraries:
66
+
67
+ ```bash
68
+ pip install transformers einops
69
+ ```
70
+
71
+ And then:
72
  ```python
 
73
  from transformers import AutoModelForSequenceClassification
74
 
75
  model = AutoModelForSequenceClassification.from_pretrained(
76
  'jinaai/jina-reranker-v2-base-multilingual',
 
77
  torch_dtype="auto",
78
  trust_remote_code=True,
79
  )
80
 
81
+ model.to('cuda') # or 'cpu' if no GPU is available
82
+ model.eval()
83
+
84
  # Example query and documents
85
  query = "Organic skincare products for sensitive skin"
86
  documents = [
87
+ "Organic skincare for sensitive skin with aloe vera and chamomile.",
88
+ "New makeup trends focus on bold colors and innovative techniques",
89
+ "Bio-Hautpflege für empfindliche Haut mit Aloe Vera und Kamille",
90
+ "Neue Make-up-Trends setzen auf kräftige Farben und innovative Techniken",
91
+ "Cuidado de la piel orgánico para piel sensible con aloe vera y manzanilla",
92
+ "Las nuevas tendencias de maquillaje se centran en colores vivos y técnicas innovadoras",
93
+ "针对敏感肌专门设计的天然有机护肤产品",
94
+ "新的化妆趋势注重鲜艳的颜色和创新的技巧",
95
+ "敏感肌のために特別に設計された天然有機スキンケア製品",
96
+ "新しいメイクのトレンドは鮮やかな色と革新的な技術に焦点を当てています",
97
  ]
98
 
99
  # construct sentence pairs
 
102
  scores = model.compute_score(sentence_pairs, max_length=1024)
103
  ```
104
 
105
+ The scores will be a list of floats, where each float represents the relevance score of the corresponding document to the query. Higher scores indicate higher relevance.
106
+ For instance, the scores returned in this case will be:
107
+ ```text
108
+ [0.8311430811882019, 0.09401018172502518,
109
+ 0.6334102749824524, 0.08269733935594559,
110
+ 0.7620701193809509, 0.09947021305561066,
111
+ 0.9263036847114563, 0.05834583938121796,
112
+ 0.8418256044387817, 0.11124119907617569]
113
+ ```
114
+
115
+ The model gives high relevance scores to the documents that are most relevant to the query regardless of the language of the document.
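As a small illustration of how the returned scores might be used, the sketch below pairs each score with its document and sorts by descending relevance. The score values are the ones listed above, rounded to four decimals; `ranked` is a name chosen here for illustration and is not part of the model's API.

```python
# Documents in the same order as they were passed to compute_score().
documents = [
    "Organic skincare for sensitive skin with aloe vera and chamomile.",
    "New makeup trends focus on bold colors and innovative techniques",
    "Bio-Hautpflege für empfindliche Haut mit Aloe Vera und Kamille",
    "Neue Make-up-Trends setzen auf kräftige Farben und innovative Techniken",
    "Cuidado de la piel orgánico para piel sensible con aloe vera y manzanilla",
    "Las nuevas tendencias de maquillaje se centran en colores vivos y técnicas innovadoras",
    "针对敏感肌专门设计的天然有机护肤产品",
    "新的化妆趋势注重鲜艳的颜色和创新的技巧",
    "敏感肌のために特別に設計された天然有機スキンケア製品",
    "新しいメイクのトレンドは鮮やかな色と革新的な技術に焦点を当てています",
]

# Scores copied from the example output above (rounded).
scores = [0.8311, 0.0940, 0.6334, 0.0827, 0.7621,
          0.0995, 0.9263, 0.0583, 0.8418, 0.1112]

# Pair each score with its document and sort from most to least relevant.
ranked = sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)

# Show the three most relevant documents.
for score, doc in ranked[:3]:
    print(f"{score:.4f}  {doc}")
```

With these scores, the skincare documents in every language rank above all of the makeup documents.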
116
+
117
+ Note that by default, the `jina-reranker-v2-base-multilingual` model uses [flash attention](https://github.com/Dao-AILab/flash-attention), which requires certain types of GPU hardware to run.
118
+ If you encounter any issues, you can try calling `AutoModelForSequenceClassification.from_pretrained()` with `use_flash_attn=False`.
119
+ This will use the standard attention mechanism instead of flash attention.
120
+
121
+ If you want to use flash attention for fast inference, you need to install the following packages:
122
+ ```bash
123
+ pip install ninja # required for flash attention
124
+ pip install flash-attn --no-build-isolation
125
+ ```
126
+ Enjoy the 3x-6x speedup with flash attention! ⚡️⚡️⚡️
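The fallback described above could be wired up roughly as follows. This is a minimal sketch, not part of the model's API: the helper names `flash_attn_installed`, `reranker_load_kwargs`, and `load_reranker` are hypothetical, and checking that the `flash_attn` package is importable does not guarantee that your GPU supports it.

```python
import importlib.util

MODEL_ID = "jinaai/jina-reranker-v2-base-multilingual"


def flash_attn_installed() -> bool:
    """Return True if the `flash_attn` package can be found in this environment."""
    return importlib.util.find_spec("flash_attn") is not None


def reranker_load_kwargs(use_flash_attn=None) -> dict:
    """Build keyword arguments for from_pretrained() (hypothetical helper).

    If use_flash_attn is None, it is decided by whether flash_attn is installed.
    """
    if use_flash_attn is None:
        use_flash_attn = flash_attn_installed()
    kwargs = {"torch_dtype": "auto", "trust_remote_code": True}
    if not use_flash_attn:
        # Ask the model code to use the standard attention mechanism instead.
        kwargs["use_flash_attn"] = False
    return kwargs


def load_reranker(device="cpu", use_flash_attn=None):
    """Load the reranker on the given device (hypothetical helper)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForSequenceClassification

    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_ID, **reranker_load_kwargs(use_flash_attn)
    )
    model.to(device)
    model.eval()
    return model
```

You could then call `load_reranker("cuda")` on a GPU machine, or `load_reranker("cpu", use_flash_attn=False)` to force standard attention.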
127
 
128
+ That's it! You can now use the `jina-reranker-v2-base-multilingual` model in your projects.
129
 
130
 
131
  In addition to the `compute_score()` function, the `jina-reranker-v2-base-multilingual` model also provides a `model.rerank()` function that can be used to rerank documents based on a query. You can use it as follows:
 
142
 
143
  Inside the `result` object, you will find the reranked documents along with their scores. You can use this information to further process the documents as needed.
144
 
145
+ The `rerank()` function will automatically chunk the input documents into smaller pieces if they exceed the model's maximum input length. This allows you to rerank long documents without running into memory issues.
146
  Specifically, the `rerank()` function will split the documents into chunks of size `max_length` and rerank each chunk separately. The scores from all the chunks are then combined to produce the final reranking results. You can control the query length and document length in each chunk by setting the `max_query_length` and `max_length` parameters. The `rerank()` function also supports the `overlap` parameter (default is `80`) which determines how much overlap there is between adjacent chunks. This can be useful when reranking long documents to ensure that the model has enough context to make accurate predictions.
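To make the role of `overlap` concrete, here is a minimal sketch of overlapping chunking over a list of tokens. It is an illustration under stated assumptions, not the model's actual implementation: `chunk_with_overlap` and `chunk_size` are hypothetical names, and how `rerank()` tokenizes documents and combines chunk scores is not shown here.

```python
def chunk_with_overlap(tokens, chunk_size, overlap=80):
    """Split `tokens` into windows of at most `chunk_size` items.

    Adjacent windows share `overlap` items, so text near a chunk
    boundary appears in both neighbouring chunks.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


# Toy example: 10 "tokens", windows of 4 with an overlap of 1.
print(chunk_with_overlap(list(range(10)), chunk_size=4, overlap=1))
# [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

A larger overlap gives each chunk more shared context at the cost of producing more chunks to score.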
147
+
148
+
149
+ # Evaluation
150
+
151
+ We evaluated Jina Reranker v2 on multiple benchmarks covering multilingual retrieval, long-document retrieval, code search, tool lookup, and table search.
152
+
153
+ | Model Name | Miracl (nDCG@10, 18 langs) | MKQA (nDCG@10, 26 langs) | BEIR (nDCG@10, 17 datasets) | MLDR (recall@10, 13 langs) | CodeSearchNet (MRR@10, 3 tasks) | AirBench (nDCG@10, zh/en) | ToolBench (recall@3, 3 tasks) | TableSearch (recall@3) |
154
+ |:-----------------------------: |:-------------------------: |------------------------- |---------------------------- |--------------------------- |--------------------------------- |--------------------------- |------------------------------- |------------------------ |
155
+ | jina-reranker-v2-base-multilingual | 62.14 | 54.83 | 53.17 | 68.95 | 71.36 | 61.33 | 77.75 | 93.31 |
156
+ | bge-reranker-v2-m3 | 63.43 | 54.17 | 53.65 | 59.73 | 62.86 | 61.28 | 78.46 | 74.86 |
157
+ | mmarco-mMiniLMv2-L12-H384-v1 | 59.71 | 53.37 | 45.40 | 28.91 | 51.78 | 56.46 | 58.39 | 53.60 |
158
+ | jina-reranker-v1-base-en | - | - | 52.45 | - | - | - | 74.13 | 72.89 |
159
+
160
+ Note:
161
+ - nDCG@10 and MRR@10 measure ranking quality over the top 10 results, with higher scores indicating better search results.
162
+ - recall@3 and recall@10 measure the proportion of relevant documents retrieved in the top 3 and top 10 results, with higher scores indicating better search results.
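For readers unfamiliar with these metrics, the sketch below shows one common way to compute them for a single query. It is an illustration only and not the evaluation code used for the table: the function names are chosen here, relevance labels are assumed to be given in ranked order, and the ideal ordering for nDCG is taken from the ranked list itself.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k relevance labels."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances, k=10):
    """DCG normalised by the DCG of the best possible ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def mrr_at_k(relevances, k=10):
    """Reciprocal rank of the first relevant result within the top k."""
    for rank, rel in enumerate(relevances[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def recall_at_k(relevances, k, total_relevant):
    """Fraction of all relevant documents that appear in the top k."""
    if total_relevant == 0:
        return 0.0
    hits = sum(1 for rel in relevances[:k] if rel > 0)
    return hits / total_relevant


# Relevance labels of the results in ranked order (1 = relevant, 0 = not).
ranked_relevance = [0, 1, 0, 1]
print(ndcg_at_k(ranked_relevance, k=10))
print(mrr_at_k(ranked_relevance, k=10))
print(recall_at_k(ranked_relevance, k=3, total_relevant=2))
```

Benchmark scores such as those in the table are typically the average of these per-query values, reported on a 0-100 scale.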