Commit 30aaf52
Parent(s): 2579c47
Update README.md
README.md CHANGED
@@ -20,10 +20,17 @@ license: apache-2.0
|
|
20 |
</a>
|
21 |
</p>
|
22 |
|
|
|
23 |
<p align="left">
|
24 |
<a href="https://github.com/netease-youdao/BCEmbedding">GitHub</a>
|
25 |
</p>
|
26 |
|
27 |
<details open="open">
|
28 |
<summary>Click to Open Contents</summary>
|
29 |
|
@@ -33,7 +40,8 @@ license: apache-2.0
|
|
33 |
- <a href="#-model-list" target="_Self">🍎 Model List</a>
|
34 |
- <a href="#-manual" target="_Self">📖 Manual</a>
|
35 |
- <a href="#installation" target="_Self">Installation</a>
|
36 |
-
- <a href="#quick-start" target="_Self">Quick Start</a>
|
|
|
37 |
- <a href="#%EF%B8%8F-evaluation" target="_Self">⚙️ Evaluation</a>
|
38 |
- <a href="#evaluate-semantic-representation-by-mteb" target="_Self">Evaluate Semantic Representation by MTEB</a>
|
39 |
- <a href="#evaluate-rag-by-llamaindex" target="_Self">Evaluate RAG by LlamaIndex</a>
|
@@ -127,17 +135,20 @@ Existing embedding models often encounter performance challenges in bilingual an
|
|
127 |
### Installation
|
128 |
|
129 |
First, create a conda environment and activate it.
|
|
|
130 |
```bash
|
131 |
conda create --name bce python=3.10 -y
|
132 |
conda activate bce
|
133 |
```
|
134 |
|
135 |
-
Then install `BCEmbedding
|
|
|
136 |
```bash
|
137 |
-
pip install
|
138 |
```
|
139 |
|
140 |
Or install from source:
|
|
|
141 |
```bash
|
142 |
git clone git@github.com:netease-youdao/BCEmbedding.git
|
143 |
cd BCEmbedding
|
@@ -146,7 +157,9 @@ pip install -v -e .
|
|
146 |
|
147 |
### Quick Start
|
148 |
|
149 |
-
|
|
|
|
|
150 |
|
151 |
```python
|
152 |
from BCEmbedding import EmbeddingModel
|
@@ -161,7 +174,7 @@ model = EmbeddingModel(model_name_or_path="maidalun1020/bce-embedding-base_v1")
|
|
161 |
embeddings = model.encode(sentences)
|
162 |
```
|
163 |
|
164 |
-
Use `RerankerModel`
|
165 |
|
166 |
```python
|
167 |
from BCEmbedding import RerankerModel
|
@@ -183,6 +196,164 @@ scores = model.compute_score(sentence_pairs)
|
|
183 |
rerank_results = model.rerank(query, passages)
|
184 |
```
|
185 |
|
186 |
## ⚙️ Evaluation
|
187 |
|
188 |
### Evaluate Semantic Representation by MTEB
|
@@ -193,9 +364,9 @@ We provide evaluation tools for `embedding` and `reranker` models, based on [MT
|
|
193 |
|
194 |
#### 1. Embedding Models
|
195 |
|
196 |
-
Just run following cmd to evaluate `your_embedding_model` (e.g. `maidalun1020/bce-embedding-base_v1`) in **
|
197 |
|
198 |
-
运行下面命令评测`your_embedding_model`(比如,`maidalun1020/bce-embedding-base_v1
|
199 |
|
200 |
```bash
|
201 |
python BCEmbedding/tools/eval_mteb/eval_embedding_mteb.py --model_name_or_path maidalun1020/bce-embedding-base_v1 --pooler cls
|
@@ -206,8 +377,11 @@ The total evaluation tasks contain ***114 datasets*** of **"Retrieval", "STS", "
|
|
206 |
评测包含 **"Retrieval", "STS", "PairClassification", "Classification", "Reranking"和"Clustering"** 这六大类任务的 ***114个数据集***。
|
207 |
|
208 |
***NOTE:***
|
209 |
-
- All models are evaluated in their
|
|
|
|
|
210 |
- "jina-embeddings-v2-base-en" model should be loaded with `trust_remote_code`.
|
|
|
211 |
```bash
|
212 |
python BCEmbedding/tools/eval_mteb/eval_embedding_mteb.py --model_name_or_path {moka-ai/m3e-base | moka-ai/m3e-large} --pooler mean
|
213 |
|
@@ -215,14 +389,14 @@ python BCEmbedding/tools/eval_mteb/eval_embedding_mteb.py --model_name_or_path j
|
|
215 |
```
|
216 |
|
217 |
***注意:***
|
218 |
-
- 所有模型的评测采用各自推荐的`pooler`。"jina-embeddings-v2-base-en"
|
219 |
- "jina-embeddings-v2-base-en"模型在载入时需要`trust_remote_code`。
|
220 |
|
221 |
#### 2. Reranker Models
|
222 |
|
223 |
-
Run following cmd to evaluate `your_reranker_model` (e.g. "maidalun1020/bce-reranker-base_v1") in **
|
224 |
|
225 |
-
运行下面命令评测`your_reranker_model`(比如,`maidalun1020/bce-reranker-base_v1
|
226 |
|
227 |
```bash
|
228 |
python BCEmbedding/tools/eval_mteb/eval_reranker_mteb.py --model_name_or_path maidalun1020/bce-reranker-base_v1
|
@@ -323,25 +497,30 @@ The summary of multiple domains evaluations can be seen in <a href=#1-multiple-d
|
|
323 |
|
324 |
#### 1. Embedding Models
|
325 |
|
326 |
-
| Model | Retrieval | STS | PairClassification | Classification | Reranking | Clustering |
|
327 |
-
|
328 |
-
| bge-base-en-v1.5 | 37.14 | 55.06 | 75.45 | 59.73 | 43.
|
329 |
-
| bge-base-zh-v1.5 | 47.
|
330 |
-
| bge-large-en-v1.5 | 37.
|
331 |
-
| bge-large-zh-v1.5 | 47.
|
332 |
-
|
|
333 |
-
|
|
334 |
-
|
|
335 |
-
|
336 |
|
337 |
***NOTE:***
|
338 |
-
- Our ***bce-embedding-base_v1*** outperforms other opensource embedding models with
|
339 |
- The evaluation covers ***114 datasets*** across **"Retrieval", "STS", "PairClassification", "Classification", "Reranking" and "Clustering"** in the `["en", "zh", "en-zh", "zh-en"]` setting.
|
340 |
- The [crosslingual evaluation datasets](https://github.com/netease-youdao/BCEmbedding/blob/master/BCEmbedding/evaluation/c_mteb/Retrieval.py) we released belong to `Retrieval` task.
|
341 |
- For more evaluation details, please check [Embedding Models Evaluation Summary](https://github.com/netease-youdao/BCEmbedding/blob/master/Docs/EvaluationSummary/embedding_eval_summary.md).
|
342 |
|
343 |
***要点:***
|
344 |
-
-
|
345 |
- 评测包含 **"Retrieval", "STS", "PairClassification", "Classification", "Reranking"和"Clustering"** 这六大类任务的共 ***114个数据集***。
|
346 |
- 我们开源的[跨语种语义表征评测数据](https://github.com/netease-youdao/BCEmbedding/blob/master/BCEmbedding/evaluation/c_mteb/Retrieval.py)属于`Retrieval`任务。
|
347 |
- 更详细的评测结果详见[Embedding模型指标汇总](https://github.com/netease-youdao/BCEmbedding/blob/master/Docs/EvaluationSummary/embedding_eval_summary.md)。
|
@@ -368,16 +547,8 @@ The summary of multiple domains evaluations can be seen in <a href=#1-multiple-d
|
|
368 |
|
369 |
#### 1. Multiple Domains Scenarios
|
370 |
|
371 |
-
|
372 |
-
|
373 |
-
| OpenAI-ada-2 | 81.04/57.35 | 88.35/67.83 | 88.89/69.64 | **90.71/75.46** |
|
374 |
-
| bge-large-en-v1.5 | 52.67/34.69 | 64.59/52.11 | 64.71/52.05 | **65.36/55.50** |
|
375 |
-
| bge-large-zh-v1.5 | 69.81/47.38 | 79.37/62.13 | 80.11/63.95 | **81.19/68.50** |
|
376 |
-
| llm-embedder | 50.85/33.26 | 63.62/51.45 | 63.54/51.32 | **64.47/54.98** |
|
377 |
-
| CohereV3-en | 53.10/35.39 | 65.75/52.80 | 66.29/53.31 | **66.91/56.93** |
|
378 |
-
| CohereV3-multilingual | 79.80/57.22 | 86.34/66.62 | 86.76/68.56 | **88.35/73.73** |
|
379 |
-
| JinaAI-v2-Base-en | 50.27/32.31 | 63.97/51.10 | 64.28/51.83 | **64.82/54.98** |
|
380 |
-
| ***bce-embedding-base_v1*** | **85.91/62.36** | **91.25/69.38** | **91.80/71.13** | ***93.46/77.02*** |
|
381 |
|
382 |
***NOTE:***
|
383 |
- In `WithoutReranker` setting, our `bce-embedding-base_v1` outperforms all the other embedding models.
|
@@ -401,7 +572,8 @@ Welcome to scan the QR code below and join the WeChat group.
|
|
401 |
|
402 |
欢迎大家扫码加入官方微信交流群。
|
403 |
|
404 |
-
|
|
|
405 |
|
406 |
## ✏️ Citation
|
407 |
|
|
|
20 |
</a>
|
21 |
</p>
|
22 |
|
23 |
+
For the latest information on bce-embedding-base_v1, and for more details on the MTEB and RAG evaluations, please visit:
|
24 |
<p align="left">
|
25 |
<a href="https://github.com/netease-youdao/BCEmbedding">GitHub</a>
|
26 |
</p>
|
27 |
|
28 |
+
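Key features:
|
29 |
+
1. Bilingual Chinese-English support, plus Chinese-English crosslingual capability;
|
30 |
+
2. Optimized for RAG and suited to more real-world business scenarios;
|
31 |
+
3. Easy to integrate into langchain and llamaindex.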
|
32 |
+
|
33 |
+
-----------------------------------------
|
34 |
<details open="open">
|
35 |
<summary>Click to Open Contents</summary>
|
36 |
|
|
|
40 |
- <a href="#-model-list" target="_Self">🍎 Model List</a>
|
41 |
- <a href="#-manual" target="_Self">📖 Manual</a>
|
42 |
- <a href="#installation" target="_Self">Installation</a>
|
43 |
+
- <a href="#quick-start" target="_Self">Quick Start (`transformers`, `sentence-transformers`)</a>
|
44 |
+
- <a href="#integrations-for-rag-frameworks" target="_Self">Integrations for RAG Frameworks (`langchain`, `llama_index`)</a>
|
45 |
- <a href="#%EF%B8%8F-evaluation" target="_Self">⚙️ Evaluation</a>
|
46 |
- <a href="#evaluate-semantic-representation-by-mteb" target="_Self">Evaluate Semantic Representation by MTEB</a>
|
47 |
- <a href="#evaluate-rag-by-llamaindex" target="_Self">Evaluate RAG by LlamaIndex</a>
|
|
|
135 |
### Installation
|
136 |
|
137 |
First, create a conda environment and activate it.
|
138 |
+
|
139 |
```bash
|
140 |
conda create --name bce python=3.10 -y
|
141 |
conda activate bce
|
142 |
```
|
143 |
|
144 |
+
Then install `BCEmbedding` (minimal installation):
|
145 |
+
|
146 |
```bash
|
147 |
+
pip install BCEmbedding==0.1.1
|
148 |
```
|
149 |
|
150 |
Or install from source:
|
151 |
+
|
152 |
```bash
|
153 |
git clone git@github.com:netease-youdao/BCEmbedding.git
|
154 |
cd BCEmbedding
|
|
|
157 |
|
158 |
### Quick Start
|
159 |
|
160 |
+
#### 1. Based on `BCEmbedding`
|
161 |
+
|
162 |
+
Use `EmbeddingModel`; the `cls` [pooler](./BCEmbedding/models/embedding.py#L24) is the default.
|
163 |
|
164 |
```python
|
165 |
from BCEmbedding import EmbeddingModel
|
|
|
174 |
embeddings = model.encode(sentences)
|
175 |
```
|
176 |
|
177 |
+
Use `RerankerModel` to calculate relevance scores and rerank:
|
178 |
|
179 |
```python
|
180 |
from BCEmbedding import RerankerModel
|
|
|
196 |
rerank_results = model.rerank(query, passages)
|
197 |
```
|
198 |
|
199 |
+
NOTE:
|
200 |
+
|
201 |
+
- The [`RerankerModel.rerank`](./BCEmbedding/models/reranker.py#L137) method applies the advanced preprocessing that we use in production to build `sentence_pairs` when the "passages" are very long.
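
One way to picture preprocessing for very long passages is sketched below. This is only an illustration and not the logic implemented in `RerankerModel.rerank`: the helper names (`split_passage`, `make_sentence_pairs`, `merge_scores`), the whitespace tokenization, and the window and overlap sizes are all assumptions made for the example.

```python
from typing import List, Tuple


def split_passage(passage: str, max_words: int = 200, overlap: int = 50) -> List[str]:
    """Split a passage into overlapping word windows (hypothetical helper)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = passage.split()
    if len(words) <= max_words:
        return [passage]
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reaches the end of the passage
    return chunks


def make_sentence_pairs(query: str, passages: List[str],
                        max_words: int = 200, overlap: int = 50) -> Tuple[List[List[str]], List[int]]:
    """Build [query, chunk] pairs and record which passage each chunk came from."""
    pairs, owners = [], []
    for idx, passage in enumerate(passages):
        for chunk in split_passage(passage, max_words=max_words, overlap=overlap):
            pairs.append([query, chunk])
            owners.append(idx)
    return pairs, owners


def merge_scores(scores: List[float], owners: List[int], num_passages: int) -> List[float]:
    """Keep the best chunk score for every original passage."""
    best = [float("-inf")] * num_passages
    for score, idx in zip(scores, owners):
        best[idx] = max(best[idx], score)
    return best
```

The pairs returned by `make_sentence_pairs` could then be scored by any reranker, with `merge_scores` mapping the chunk scores back to one score per passage. Taking the maximum is a design choice for this sketch; other aggregations are possible.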
|
202 |
+
|
203 |
+
#### 2. Based on `transformers`
|
204 |
+
|
205 |
+
For `EmbeddingModel`:
|
206 |
+
|
207 |
+
```python
|
208 |
+
from transformers import AutoModel, AutoTokenizer
|
209 |
+
|
210 |
+
# list of sentences
|
211 |
+
sentences = ['sentence_0', 'sentence_1', ...]
|
212 |
+
|
213 |
+
# init model and tokenizer
|
214 |
+
tokenizer = AutoTokenizer.from_pretrained('maidalun1020/bce-embedding-base_v1')
|
215 |
+
model = AutoModel.from_pretrained('maidalun1020/bce-embedding-base_v1')
|
216 |
+
|
217 |
+
device = 'cuda' # if no GPU, set "cpu"
|
218 |
+
model.to(device)
|
219 |
+
|
220 |
+
# get inputs
|
221 |
+
inputs = tokenizer(sentences, padding=True, truncation=True, max_length=512, return_tensors="pt")
|
222 |
+
inputs_on_device = {k: v.to(device) for k, v in inputs.items()}
|
223 |
+
|
224 |
+
# get embeddings
|
225 |
+
outputs = model(**inputs_on_device, return_dict=True)
|
226 |
+
embeddings = outputs.last_hidden_state[:, 0] # cls pooler
|
227 |
+
embeddings = embeddings / embeddings.norm(dim=1, keepdim=True) # normalize
|
228 |
+
```
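
For intuition, the `cls` pooling and normalization steps in the snippet above can be written out on plain Python lists, next to the `mean` pooling that some other models use. This is a minimal sketch rather than the library's code: the function names are hypothetical, and a real pipeline would operate on tensors.

```python
import math
from typing import List

Vector = List[float]


def cls_pool(token_states: List[Vector]) -> Vector:
    """Return the hidden state of the first ([CLS]) token."""
    return list(token_states[0])


def mean_pool(token_states: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the hidden states of non-padding tokens only."""
    kept = [state for state, keep in zip(token_states, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(state[d] for state in kept) / len(kept) for d in range(dim)]


def l2_normalize(vector: Vector) -> Vector:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


# Toy hidden states for one sentence: three tokens, the last one is padding.
states = [[3.0, 4.0], [1.0, 0.0], [9.0, 9.0]]
mask = [1, 1, 0]
cls_embedding = l2_normalize(cls_pool(states))
mean_embedding = l2_normalize(mean_pool(states, mask))
```

The two poolers generally give different vectors for the same input, which is why the evaluation notes ask for each model to be run with its recommended `pooler`.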
|
229 |
+
|
230 |
+
For `RerankerModel`:
|
231 |
+
|
232 |
+
```python
|
233 |
+
import torch
|
234 |
+
from transformers import AutoTokenizer, AutoModelForSequenceClassification
|
235 |
+
|
236 |
+
# init model and tokenizer
|
237 |
+
tokenizer = AutoTokenizer.from_pretrained('maidalun1020/bce-reranker-base_v1')
|
238 |
+
model = AutoModelForSequenceClassification.from_pretrained('maidalun1020/bce-reranker-base_v1')
|
239 |
+
|
240 |
+
device = 'cuda' # if no GPU, set "cpu"
|
241 |
+
model.to(device)
|
242 |
+
|
243 |
+
# get inputs
|
244 |
+
inputs = tokenizer(sentence_pairs, padding=True, truncation=True, max_length=512, return_tensors="pt")
|
245 |
+
inputs_on_device = {k: v.to(device) for k, v in inputs.items()}
|
246 |
+
|
247 |
+
# calculate scores
|
248 |
+
scores = model(**inputs_on_device, return_dict=True).logits.view(-1,).float()
|
249 |
+
scores = torch.sigmoid(scores)
|
250 |
+
```
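
The reranker snippet above assumes that `sentence_pairs` already exists. A minimal sketch of how such pairs might be built, and how raw logits could be turned into a ranking, is shown below. The helper names are hypothetical and the logits are made-up numbers standing in for model output.

```python
import math
from typing import List, Tuple


def build_sentence_pairs(query: str, passages: List[str]) -> List[List[str]]:
    """Pair the query with every passage, in passage order."""
    return [[query, passage] for passage in passages]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def rank_passages(passages: List[str], logits: List[float]) -> List[Tuple[str, float]]:
    """Map logits to (0, 1) scores and sort passages from most to least relevant."""
    if len(passages) != len(logits):
        raise ValueError("passages and logits must have the same length")
    scored = [(passage, sigmoid(logit)) for passage, logit in zip(passages, logits)]
    return sorted(scored, key=lambda item: item[1], reverse=True)


query = "apples"
passages = ["I like apples", "I like oranges", "Apples and oranges are fruits"]
sentence_pairs = build_sentence_pairs(query, passages)

# Placeholder logits; in practice these would come from the reranker model.
ranked = rank_passages(passages, [2.0, -1.0, 0.0])
```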
|
251 |
+
|
252 |
+
#### 3. Based on `sentence_transformers`
|
253 |
+
|
254 |
+
For `EmbeddingModel`:
|
255 |
+
|
256 |
+
```python
|
257 |
+
from sentence_transformers import SentenceTransformer
|
258 |
+
|
259 |
+
# list of sentences
|
260 |
+
sentences = ['sentence_0', 'sentence_1', ...]
|
261 |
+
|
262 |
+
# init embedding model
|
263 |
+
# NOTE: the model files were updated for sentence-transformers. Delete "`SENTENCE_TRANSFORMERS_HOME`/maidalun1020_bce-embedding-base_v1" or "~/.cache/torch/sentence_transformers/maidalun1020_bce-embedding-base_v1" first so that the new version is downloaded.
|
264 |
+
model = SentenceTransformer("maidalun1020/bce-embedding-base_v1")
|
265 |
+
|
266 |
+
# extract embeddings
|
267 |
+
embeddings = model.encode(sentences, normalize_embeddings=True)
|
268 |
+
```
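
Because `normalize_embeddings=True` returns unit-length vectors, a plain dot product between two embeddings equals their cosine similarity. The sketch below checks this on toy vectors. The function names are hypothetical and the numbers are not real model output.

```python
import math
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally sized vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity for vectors of any non-zero length."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot(a, b) / (norm_a * norm_b)


# The same two directions, once unnormalized and once scaled to unit length.
raw_a, raw_b = [3.0, 4.0], [2.0, 0.0]
unit_a, unit_b = [0.6, 0.8], [1.0, 0.0]

similarity_from_raw = cosine_similarity(raw_a, raw_b)
similarity_from_unit = dot(unit_a, unit_b)
```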
|
269 |
+
|
270 |
+
For `RerankerModel`:
|
271 |
+
|
272 |
+
```python
|
273 |
+
from sentence_transformers import CrossEncoder
|
274 |
+
|
275 |
+
# init reranker model
|
276 |
+
model = CrossEncoder('maidalun1020/bce-reranker-base_v1', max_length=512)
|
277 |
+
|
278 |
+
# calculate scores of sentence pairs
|
279 |
+
scores = model.predict(sentence_pairs)
|
280 |
+
```
|
281 |
+
|
282 |
+
### Integrations for RAG Frameworks
|
283 |
+
|
284 |
+
#### 1. Used in `langchain`
|
285 |
+
|
286 |
+
```python
|
287 |
+
from langchain.embeddings import HuggingFaceEmbeddings
|
288 |
+
from langchain_community.vectorstores import FAISS
|
289 |
+
from langchain_community.vectorstores.utils import DistanceStrategy
|
290 |
+
|
291 |
+
query = 'apples'
|
292 |
+
passages = [
|
293 |
+
'I like apples',
|
294 |
+
'I like oranges',
|
295 |
+
'Apples and oranges are fruits'
|
296 |
+
]
|
297 |
+
|
298 |
+
# init embedding model
|
299 |
+
model_name = 'maidalun1020/bce-embedding-base_v1'
|
300 |
+
model_kwargs = {'device': 'cuda'}
|
301 |
+
encode_kwargs = {'batch_size': 64, 'normalize_embeddings': True, 'show_progress_bar': False}
|
302 |
+
|
303 |
+
embed_model = HuggingFaceEmbeddings(
|
304 |
+
model_name=model_name,
|
305 |
+
model_kwargs=model_kwargs,
|
306 |
+
encode_kwargs=encode_kwargs
|
307 |
+
)
|
308 |
+
|
309 |
+
# example #1. extract embeddings
|
310 |
+
query_embedding = embed_model.embed_query(query)
|
311 |
+
passages_embeddings = embed_model.embed_documents(passages)
|
312 |
+
|
313 |
+
# example #2. langchain retriever example
|
314 |
+
faiss_vectorstore = FAISS.from_texts(passages, embed_model, distance_strategy=DistanceStrategy.MAX_INNER_PRODUCT)
|
315 |
+
|
316 |
+
retriever = faiss_vectorstore.as_retriever(search_type="similarity", search_kwargs={"score_threshold": 0.5, "k": 3})
|
317 |
+
|
318 |
+
related_passages = retriever.get_relevant_documents(query)
|
319 |
+
```
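
The retrieval step in the example above amounts to scoring every passage by inner product, dropping low scores, and keeping the best `k`. The sketch below spells that idea out in plain Python. It is not how FAISS or langchain implement the search: `retrieve` is a hypothetical helper and the two-dimensional vectors are toy stand-ins for real embeddings.

```python
from typing import List, Sequence, Tuple


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally sized vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_vec: Sequence[float],
             passage_vecs: List[Sequence[float]],
             passages: List[str],
             k: int = 3,
             score_threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Return up to k passages whose inner-product score reaches the threshold."""
    scored = [(passage, inner_product(query_vec, vec))
              for passage, vec in zip(passages, passage_vecs)]
    kept = [item for item in scored if item[1] >= score_threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:k]


passages = ["I like apples", "I like oranges", "Apples and oranges are fruits"]
# Toy unit vectors chosen by hand for illustration.
passage_vecs = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
query_vec = [1.0, 0.0]

related = retrieve(query_vec, passage_vecs, passages, k=3, score_threshold=0.5)
```

With normalized embeddings the inner product is bounded by 1, which makes a fixed threshold such as 0.5 easier to interpret.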
|
320 |
+
|
321 |
+
#### 2. Used in `llama_index`
|
322 |
+
|
323 |
+
```python
|
324 |
+
import os
from llama_index.embeddings import HuggingFaceEmbedding
|
325 |
+
from llama_index import VectorStoreIndex, ServiceContext, SimpleDirectoryReader
|
326 |
+
from llama_index.node_parser import SimpleNodeParser
|
327 |
+
from llama_index.llms import OpenAI
|
328 |
+
|
329 |
+
query = 'apples'
|
330 |
+
passages = [
|
331 |
+
'I like apples',
|
332 |
+
'I like oranges',
|
333 |
+
'Apples and oranges are fruits'
|
334 |
+
]
|
335 |
+
|
336 |
+
# init embedding model
|
337 |
+
model_args = {'model_name': 'maidalun1020/bce-embedding-base_v1', 'max_length': 512, 'embed_batch_size': 64, 'device': 'cuda'}
|
338 |
+
embed_model = HuggingFaceEmbedding(**model_args)
|
339 |
+
|
340 |
+
# example #1. extract embeddings
|
341 |
+
query_embedding = embed_model.get_query_embedding(query)
|
342 |
+
passages_embeddings = embed_model.get_text_embedding_batch(passages)
|
343 |
+
|
344 |
+
# example #2. rag example
|
345 |
+
llm = OpenAI(model='gpt-3.5-turbo-0613', api_key=os.environ.get('OPENAI_API_KEY'), api_base=os.environ.get('OPENAI_BASE_URL'))
|
346 |
+
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
|
347 |
+
|
348 |
+
documents = SimpleDirectoryReader(input_files=["BCEmbedding/tools/eval_rag/eval_pdfs/Comp_en_llama2.pdf"]).load_data()
|
349 |
+
node_parser = SimpleNodeParser.from_defaults(chunk_size=512)
|
350 |
+
nodes = node_parser.get_nodes_from_documents(documents[0:36])
|
351 |
+
index = VectorStoreIndex(nodes, service_context=service_context)
|
352 |
+
query_engine = index.as_query_engine()
|
353 |
+
response = query_engine.query("What is llama?")
|
354 |
+
```
|
355 |
+
|
356 |
+
|
357 |
## ⚙️ Evaluation
|
358 |
|
359 |
### Evaluate Semantic Representation by MTEB
|
|
|
364 |
|
365 |
#### 1. Embedding Models
|
366 |
|
367 |
+
Run the following command to evaluate `your_embedding_model` (e.g. `maidalun1020/bce-embedding-base_v1`) in **bilingual and crosslingual settings** (e.g. `["en", "zh", "en-zh", "zh-en"]`).
|
368 |
|
369 |
+
运行下面命令评测`your_embedding_model`(比如,`maidalun1020/bce-embedding-base_v1`)。评测任务将会在**双语和跨语种**(比如,`["en", "zh", "en-zh", "zh-en"]`)模式下评测:
|
370 |
|
371 |
```bash
|
372 |
python BCEmbedding/tools/eval_mteb/eval_embedding_mteb.py --model_name_or_path maidalun1020/bce-embedding-base_v1 --pooler cls
|
|
|
377 |
评测包含 **"Retrieval", "STS", "PairClassification", "Classification", "Reranking"和"Clustering"** 这六大类任务的 ***114个数据集***。
|
378 |
|
379 |
***NOTE:***
|
380 |
+
- **All models are evaluated with their recommended pooling method (`pooler`)**.
|
381 |
+
- `mean` pooler: "jina-embeddings-v2-base-en", "m3e-base", "m3e-large", "e5-large-v2", "multilingual-e5-base", "multilingual-e5-large" and "gte-large".
|
382 |
+
- `cls` pooler: Other models.
|
383 |
- "jina-embeddings-v2-base-en" model should be loaded with `trust_remote_code`.
|
384 |
+
|
385 |
```bash
|
386 |
python BCEmbedding/tools/eval_mteb/eval_embedding_mteb.py --model_name_or_path {moka-ai/m3e-base | moka-ai/m3e-large} --pooler mean
|
387 |
|
|
|
389 |
```
|
390 |
|
391 |
***注意:***
|
392 |
+
- 所有模型的评测采用各自推荐的`pooler`。"jina-embeddings-v2-base-en", "m3e-base", "m3e-large", "e5-large-v2", "multilingual-e5-base", "multilingual-e5-large"和"gte-large"的 `pooler`采用`mean`,其他模型的`pooler`采用`cls`.
|
393 |
- "jina-embeddings-v2-base-en"模型在载入时需要`trust_remote_code`。
|
394 |
|
395 |
#### 2. Reranker Models
|
396 |
|
397 |
+
Run the following command to evaluate `your_reranker_model` (e.g. `maidalun1020/bce-reranker-base_v1`) in **bilingual and crosslingual settings** (e.g. `["en", "zh", "en-zh", "zh-en"]`).
|
398 |
|
399 |
+
运行下面命令评测`your_reranker_model`(比如,`maidalun1020/bce-reranker-base_v1`)。评测任务将会在 **双语种和跨语种**(比如,`["en", "zh", "en-zh", "zh-en"]`)模式下评测:
|
400 |
|
401 |
```bash
|
402 |
python BCEmbedding/tools/eval_mteb/eval_reranker_mteb.py --model_name_or_path maidalun1020/bce-reranker-base_v1
|
|
|
497 |
|
498 |
#### 1. Embedding Models
|
499 |
|
500 |
+
| Model | Dimensions | Pooler | Instructions | Retrieval (47) | STS (19) | PairClassification (5) | Classification (21) | Reranking (12) | Clustering (15) | ***AVG*** (119) |
|
501 |
+
|:--------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|
|
502 |
+
| bge-base-en-v1.5 | 768 | `cls` | Need | 37.14 | 55.06 | 75.45 | 59.73 | 43.00 | 37.74 | 47.19 |
|
503 |
+
| bge-base-zh-v1.5 | 768 | `cls` | Need | 47.63 | 63.72 | 77.40 | 63.38 | 54.95 | 32.56 | 53.62 |
|
504 |
+
| bge-large-en-v1.5 | 1024 | `cls` | Need | 37.18 | 54.09 | 75.00 | 59.24 | 42.47 | 37.32 | 46.80 |
|
505 |
+
| bge-large-zh-v1.5 | 1024 | `cls` | Need | 47.58 | 64.73 | 79.14 | 64.19 | 55.98 | 33.26 | 54.23 |
|
506 |
+
| e5-large-v2 | 1024 | `mean` | Need | 35.98 | 55.23 | 75.28 | 59.53 | 42.12 | 36.51 | 46.52 |
|
507 |
+
| gte-large | 1024 | `mean` | Free | 36.68 | 55.22 | 74.29 | 57.73 | 42.44 | 38.51 | 46.67 |
|
508 |
+
| gte-large-zh | 1024 | `cls` | Free | 41.15 | 64.62 | 77.58 | 62.04 | 55.62 | 33.03 | 51.51 |
|
509 |
+
| jina-embeddings-v2-base-en | 768 | `mean` | Free | 31.58 | 54.28 | 74.84 | 58.42 | 41.16 | 34.67 | 44.29 |
|
510 |
+
| m3e-base | 768 | `mean` | Free | 46.29 | 63.93 | 71.84 | 64.08 | 52.38 | 37.84 | 53.54 |
|
511 |
+
| m3e-large | 1024 | `mean` | Free | 34.85 | 59.74 | 67.69 | 60.07 | 48.99 | 31.62 | 46.78 |
|
512 |
+
| multilingual-e5-base | 768 | `mean` | Need | 54.73 | 65.49 | 76.97 | 69.72 | 55.01 | 38.44 | 58.34 |
|
513 |
+
| multilingual-e5-large | 1024 | `mean` | Need | 56.76 | 66.79 | 78.80 | 71.61 | 56.49 | 43.09 | 60.50 |
|
514 |
+
| ***bce-embedding-base_v1*** | 768 | `cls` | Free | 57.60 | 65.73 | 74.96 | 69.00 | 57.29 | 38.95 | 59.43 |
|
515 |
|
516 |
***NOTE:***
|
517 |
+
- Our ***bce-embedding-base_v1*** outperforms other open-source embedding models of comparable size.
|
518 |
- The evaluation covers ***114 datasets*** across **"Retrieval", "STS", "PairClassification", "Classification", "Reranking" and "Clustering"** in the `["en", "zh", "en-zh", "zh-en"]` setting.
|
519 |
- The [crosslingual evaluation datasets](https://github.com/netease-youdao/BCEmbedding/blob/master/BCEmbedding/evaluation/c_mteb/Retrieval.py) we released belong to `Retrieval` task.
|
520 |
- For more evaluation details, please check [Embedding Models Evaluation Summary](https://github.com/netease-youdao/BCEmbedding/blob/master/Docs/EvaluationSummary/embedding_eval_summary.md).
|
521 |
|
522 |
***要点:***
|
523 |
+
- 对比其他开源的相同规模的embedding模型,***bce-embedding-base_v1*** 表现最好,效果比最好的large模型稍差。
|
524 |
- 评测包含 **"Retrieval", "STS", "PairClassification", "Classification", "Reranking"和"Clustering"** 这六大类任务的共 ***114个数据集***。
|
525 |
- 我们开源的[跨语种语义表征评测数据](https://github.com/netease-youdao/BCEmbedding/blob/master/BCEmbedding/evaluation/c_mteb/Retrieval.py)属于`Retrieval`任务。
|
526 |
- 更详细的评测结果详见[Embedding模型指标汇总](https://github.com/netease-youdao/BCEmbedding/blob/master/Docs/EvaluationSummary/embedding_eval_summary.md)。
|
|
|
547 |
|
548 |
#### 1. Multiple Domains Scenarios
|
549 |
|
550 |
+
|
551 |
+
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/64745e955aba8edfb2ed561a/NyV_6ZrsaqUluUnxHKR_m.jpeg)
|
552 |
|
553 |
***NOTE:***
|
554 |
- In `WithoutReranker` setting, our `bce-embedding-base_v1` outperforms all the other embedding models.
|
|
|
572 |
|
573 |
欢迎大家扫码加入官方微信交流群。
|
574 |
|
575 |
+
|
576 |
+
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/64745e955aba8edfb2ed561a/mMlIkYn2qPXlivq4wtvyy.jpeg)
|
577 |
|
578 |
## ✏️ Citation
|
579 |
|