Upload folder using huggingface_hub
README.md CHANGED
@@ -7,7 +7,7 @@ tags:
- mteb
license: apache-2.0
model-index:
- - name: bge-en-
+ - name: bge-en-icl
  results:
  - dataset:
      config: en
@@ -2616,7 +2616,7 @@ print(scores.tolist())

## Evaluation

- `bge-en-
+ `bge-en-icl` achieves **state-of-the-art performance on both the MTEB and AIR-Bench leaderboards!**

- **MTEB**:

@@ -2630,12 +2630,12 @@ print(scores.tolist())
| **gte-Qwen2-7B-instruct** | 83.04 | 31.35 | 85.79 | 86.58 | **61.42** | 56.92 | 60.25 | 70.24 |
| **stella_en_1.5B_v5** | 84.51 | **31.49** | 88.07 | 88.07 | 61.21 | 57.69 | 61.21 | 71.19 |
| **bge-multilingual-gemma2** | 83.88 | 31.20 | 85.84 | 88.08 | 59.72 | 54.65 | 59.24 | 69.88 |
- | **bge-en-
- | **bge-en-
+ | **bge-en-icl zero-shot** | 83.74 | 30.75 | 87.21 | 88.66 | 59.66 | 57.57 | 61.67 | 71.26 |
+ | **bge-en-icl few-shot** | 84.25 | 30.77 | 88.38 | **88.99** | 59.82 | **57.89** | **62.16** | **71.69** |

- **BEIR**:

- | BEIR | e5-mistral-7b-instruct | SFR-Embedding-Mistral | NV-Embed-v1 | Linq-Embed-Mistral | SFR-Embedding-2_R | gte-Qwen2-7B-instruct | stella_en_1.5B_v5 | bge-multilingual-gemma2 | bge-en-
+ | BEIR | e5-mistral-7b-instruct | SFR-Embedding-Mistral | NV-Embed-v1 | Linq-Embed-Mistral | SFR-Embedding-2_R | gte-Qwen2-7B-instruct | stella_en_1.5B_v5 | bge-multilingual-gemma2 | bge-en-icl zero-shot | bge-en-icl few-shot |
| :----------------: | :--------------------: | :-------------------: | :---------: | :----------------: | :---------------: | :-------------------: | :----------------: | :---------------------: | :----------------------: | :---------------------: |
| **ArguAna** | 61.9 | 67.27 | 68.21 | 69.65 | 62.34 | 64.27 | 65.27 | 77.37 | 82.76 | **83.08** |
| **ClimateFEVER** | 38.4 | 36.41 | 34.72 | 39.11 | 34.43 | **45.88** | 46.11 | 39.37 | 45.35 | 45.43 |
@@ -2666,8 +2666,8 @@ print(scores.tolist())
| **Linq-Embed-Mistral** | 61.04 | 48.41 | 49.44 | **60.18** | 20.34 | 50.04 | 47.56 | 60.50 | 49.69 |
| **gte-Qwen2-7B-instruct** | 63.46 | 51.20 | 54.07 | 54.20 | 22.31 | **58.20** | 40.27 | 58.39 | 50.26 |
| **stella_en_1.5B_v5** | 61.99 | 50.88 | 53.87 | 58.81 | 23.22 | 57.26 | 44.81 | 61.38 | 51.53 |
- | **bge-en-
- | **bge-en-
+ | **bge-en-icl zero-shot** | 64.61 | 54.40 | 55.11 | 57.25 | 25.10 | 54.81 | 48.46 | 63.71 | 52.93 |
+ | **bge-en-icl few-shot** | **64.94** | **55.11** | **56.02** | 58.85 | **28.29** | 57.16 | **50.04** | **64.50** | **54.36** |

**Long-Doc (en, Recall@10):**

@@ -2680,8 +2680,8 @@ print(scores.tolist())
| **Linq-Embed-Mistral** | 75.46 | 73.81 | 71.58 | 68.58 | 72.11 |
| **gte-Qwen2-7B-instruct** | 63.93 | 68.51 | 65.59 | 65.26 | 65.45 |
| **stella_en_1.5B_v5** | 73.17 | 74.38 | 70.02 | 69.32 | 71.25 |
- | **bge-en-
- | **bge-en-
+ | **bge-en-icl zero-shot** | 78.30 | 78.21 | 73.65 | 67.09 | 73.75 |
+ | **bge-en-icl few-shot** | **79.63** | **79.36** | **74.80** | 67.79 | **74.83** |


## Model List
@@ -2690,7 +2690,7 @@ print(scores.tolist())

| Model | Language | | Description | query instruction for retrieval [1] |
|:--------------------------------------------------------------------------|:-------------------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------:|:--------:|
- | [BAAI/bge-en-
+ | [BAAI/bge-en-icl](https://huggingface.co/BAAI/bge-en-icl) | English | - | An LLM-based dense retriever with in-context learning capabilities that can fully leverage the model's potential based on few-shot examples (4096 tokens) | Provide instructions and few-shot examples freely based on the given task. |
| [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) | Multilingual | [Inference](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3#usage) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3) | Multi-Functionality(dense retrieval, sparse retrieval, multi-vector(colbert)), Multi-Linguality, and Multi-Granularity(8192 tokens) | |
| [BAAI/llm-embedder](https://huggingface.co/BAAI/llm-embedder) | English | [Inference](./FlagEmbedding/llm_embedder/README.md) [Fine-tune](./FlagEmbedding/llm_embedder/README.md) | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See [README](./FlagEmbedding/llm_embedder/README.md) |
| [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |