maidalun1020 committed on
Commit 24e934d
1 Parent(s): 3b18090

Update README.md

Files changed (1)
  1. README.md +26 -21
README.md CHANGED
@@ -20,10 +20,17 @@ license: apache-2.0
  </a>
  </p>

  <p align="left">
  <a href="https://github.com/netease-youdao/BCEmbedding">GitHub</a>
  </p>

  <details open="open">
  <summary>Click to Open Contents</summary>

@@ -323,16 +330,21 @@ The summary of multiple domains evaluations can be seen in <a href=#1-multiple-d

  #### 1. Embedding Models

- | Model | Retrieval | STS | PairClassification | Classification | Reranking | Clustering | Avg |
- |:-------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|
- | bge-base-en-v1.5 | 37.14 | 55.06 | 75.45 | 59.73 | 43.05 | 37.74 | 47.20 |
- | bge-base-zh-v1.5 | 47.60 | 63.72 | 77.40 | 63.38 | 54.85 | 32.56 | 53.60 |
- | bge-large-en-v1.5 | 37.15 | 54.09 | 75.00 | 59.24 | 42.68 | 37.32 | 46.82 |
- | bge-large-zh-v1.5 | 47.54 | 64.73 | **79.14** | 64.19 | 55.88 | 33.26 | 54.21 |
- | jina-embeddings-v2-base-en | 31.58 | 54.28 | 74.84 | 58.42 | 41.16 | 34.67 | 44.29 |
- | m3e-base | 46.29 | 63.93 | 71.84 | 64.08 | 52.38 | 37.84 | 53.54 |
- | m3e-large | 34.85 | 59.74 | 67.69 | 60.07 | 48.99 | 31.62 | 46.78 |
- | ***bce-embedding-base_v1*** | **57.60** | **65.73** | 74.96 | **69.00** | **57.29** | **38.95** | **59.43** |

  ***NOTE:***
  - Our ***bce-embedding-base_v1*** outperforms other open-source embedding models of various model sizes.
@@ -368,16 +380,8 @@ The summary of multiple domains evaluations can be seen in <a href=#1-multiple-d

  #### 1. Multiple Domains Scenarios

- | Embedding Models | WithoutReranker <br> [*hit_rate/mrr*] | CohereRerank <br> [*hit_rate/mrr*] | bge-reranker-large <br> [*hit_rate/mrr*] | ***bce-reranker-base_v1*** <br> [*hit_rate/mrr*] |
- |:-------------------------------|:--------:|:--------:|:--------:|:--------:|
- | OpenAI-ada-2 | 81.04/57.35 | 88.35/67.83 | 88.89/69.64 | **90.71/75.46** |
- | bge-large-en-v1.5 | 52.67/34.69 | 64.59/52.11 | 64.71/52.05 | **65.36/55.50** |
- | bge-large-zh-v1.5 | 69.81/47.38 | 79.37/62.13 | 80.11/63.95 | **81.19/68.50** |
- | llm-embedder | 50.85/33.26 | 63.62/51.45 | 63.54/51.32 | **64.47/54.98** |
- | CohereV3-en | 53.10/35.39 | 65.75/52.80 | 66.29/53.31 | **66.91/56.93** |
- | CohereV3-multilingual | 79.80/57.22 | 86.34/66.62 | 86.76/68.56 | **88.35/73.73** |
- | JinaAI-v2-Base-en | 50.27/32.31 | 63.97/51.10 | 64.28/51.83 | **64.82/54.98** |
- | ***bce-embedding-base_v1*** | **85.91/62.36** | **91.25/69.38** | **91.80/71.13** | ***93.46/77.02*** |

  ***NOTE:***
  - In the `WithoutReranker` setting, our `bce-embedding-base_v1` outperforms all the other embedding models.
@@ -401,7 +405,8 @@ Welcome to scan the QR code below and join the WeChat group.

  Everyone is welcome to scan the QR code and join the official WeChat group.

- <img src="https://github.com/netease-youdao/BCEmbedding/blob/master/Docs/assets/Wechat.jpg" width="20%" height="auto">

  ## ✏️ Citation

 
  </a>
  </p>

+ For the latest information on bce-reranker-base_v1, and for more details on the MTEB and RAG evaluations, see:
  <p align="left">
  <a href="https://github.com/netease-youdao/BCEmbedding">GitHub</a>
  </p>

+ Key features:
+ 1. Supports four languages (Chinese, English, Japanese, and Korean), with cross-lingual capability among them;
+ 2. Optimized for RAG and adapted to more real-world business scenarios;
+ 3. Supports reranking of long texts.
+
+ -----------------------------------------
  <details open="open">
  <summary>Click to Open Contents</summary>

 

  #### 1. Embedding Models

+ | Model | Dimensions | Pooler | Instructions | Retrieval (47) | STS (19) | PairClassification (5) | Classification (21) | Reranking (12) | Clustering (15) | ***AVG*** (119) |
+ |:--------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|
+ | bge-base-en-v1.5 | 768 | `cls` | Need | 37.14 | 55.06 | 75.45 | 59.73 | 43.00 | 37.74 | 47.19 |
+ | bge-base-zh-v1.5 | 768 | `cls` | Need | 47.63 | 63.72 | 77.40 | 63.38 | 54.95 | 32.56 | 53.62 |
+ | bge-large-en-v1.5 | 1024 | `cls` | Need | 37.18 | 54.09 | 75.00 | 59.24 | 42.47 | 37.32 | 46.80 |
+ | bge-large-zh-v1.5 | 1024 | `cls` | Need | 47.58 | 64.73 | 79.14 | 64.19 | 55.98 | 33.26 | 54.23 |
+ | e5-large-v2 | 1024 | `mean` | Need | 35.98 | 55.23 | 75.28 | 59.53 | 42.12 | 36.51 | 46.52 |
+ | gte-large | 1024 | `mean` | Free | 36.68 | 55.22 | 74.29 | 57.73 | 42.44 | 38.51 | 46.67 |
+ | gte-large-zh | 1024 | `cls` | Free | 41.15 | 64.62 | 77.58 | 62.04 | 55.62 | 33.03 | 51.51 |
+ | jina-embeddings-v2-base-en | 768 | `mean` | Free | 31.58 | 54.28 | 74.84 | 58.42 | 41.16 | 34.67 | 44.29 |
+ | m3e-base | 768 | `mean` | Free | 46.29 | 63.93 | 71.84 | 64.08 | 52.38 | 37.84 | 53.54 |
+ | m3e-large | 1024 | `mean` | Free | 34.85 | 59.74 | 67.69 | 60.07 | 48.99 | 31.62 | 46.78 |
+ | multilingual-e5-base | 768 | `mean` | Need | 54.73 | 65.49 | 76.97 | 69.72 | 55.01 | 38.44 | 58.34 |
+ | multilingual-e5-large | 1024 | `mean` | Need | 56.76 | 66.79 | 78.80 | 71.61 | 56.49 | 43.09 | 60.50 |
+ | ***bce-embedding-base_v1*** | 768 | `cls` | Free | 57.60 | 65.73 | 74.96 | 69.00 | 57.29 | 38.95 | 59.43 |

  ***NOTE:***
  - Our ***bce-embedding-base_v1*** outperforms other open-source embedding models of various model sizes.
 

  #### 1. Multiple Domains Scenarios

+
+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/64745e955aba8edfb2ed561a/NyV_6ZrsaqUluUnxHKR_m.jpeg)

  ***NOTE:***
  - In the `WithoutReranker` setting, our `bce-embedding-base_v1` outperforms all the other embedding models.
 

  Everyone is welcome to scan the QR code and join the official WeChat group.

+
+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/64745e955aba8edfb2ed561a/mMlIkYn2qPXlivq4wtvyy.jpeg)

  ## ✏️ Citation