Update README.md
README.md
CHANGED
@@ -2615,7 +2615,6 @@ language:
|
|
2615 |
<a href="#usage">Usage</a> |
|
2616 |
<a href="#evaluation">Evaluation</a> |
|
2617 |
<a href="#train">Train</a> |
|
2618 |
-
<a href="#contact">Contact</a> |
|
2619 |
<a href="#citation">Citation</a> |
|
2620 |
<a href="#license">License</a>
|
2621 |
<p>
|
@@ -2626,13 +2625,19 @@ More details please refer to our Github: [FlagEmbedding](https://github.com/Flag
|
|
2626 |
|
2627 |
[English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)
|
2628 |
|
2629 |
-
FlagEmbedding
|
2630 |
-
And it also can be used in vector databases for LLMs.
|
2631 |
|
2632 |
-
|
2633 |
-
-
|
2634 |
- 09/15/2023: The [technical report](https://arxiv.org/pdf/2309.07597.pdf) of BGE has been released
|
2635 |
-
- 09/15/2023: The [
|
2636 |
- 09/12/2023: New models:
|
2637 |
- **New reranker model**: release cross-encoder models `BAAI/bge-reranker-base` and `BAAI/bge-reranker-large`, which are more powerful than embedding models. We recommend using or fine-tuning them to re-rank the top-k documents returned by embedding models.
|
2638 |
- **Updated embedding model**: release the `bge-*-v1.5` embedding models to alleviate the similarity distribution issue and enhance retrieval ability without instruction.
|
@@ -2657,6 +2662,7 @@ And it also can be used in vector databases for LLMs.
|
|
2657 |
|
2658 |
| Model | Language | | Description | query instruction for retrieval [1] |
|
2659 |
|:-------------------------------|:--------:| :--------:| :--------:|:--------:|
|
|
|
2660 |
| [BAAI/llm-embedder](https://huggingface.co/BAAI/llm-embedder) | English | [Inference](./FlagEmbedding/llm_embedder/README.md) [Fine-tune](./FlagEmbedding/llm_embedder/README.md) | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See [README](./FlagEmbedding/llm_embedder/README.md) |
|
2661 |
| [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
|
2662 |
| [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
|
@@ -2986,9 +2992,6 @@ The data format is the same as embedding model, so you can fine-tune it easily f
|
|
2986 |
For more details, please refer to [./FlagEmbedding/reranker/README.md](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker)
|
2987 |
|
2988 |
|
2989 |
-
## Contact
|
2990 |
-
If you have any questions or suggestions related to this project, feel free to open an issue or pull request.
|
2991 |
-
You can also email Shitao Xiao (stxiao@baai.ac.cn) and Zheng Liu (liuzheng@baai.ac.cn).
|
2992 |
|
2993 |
|
2994 |
## Citation
|
|
|
2615 |
<a href="#usage">Usage</a> |
|
2616 |
<a href="#evaluation">Evaluation</a> |
|
2617 |
<a href="#train">Train</a> |
|
|
|
2618 |
<a href="#citation">Citation</a> |
|
2619 |
<a href="#license">License</a>
|
2620 |
<p>
|
|
|
2625 |
|
2626 |
[English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)
|
2627 |
|
2628 |
+
FlagEmbedding focuses on retrieval-augmented LLMs and currently consists of the following projects:
|
|
|
2629 |
|
2630 |
+
- **Fine-tuning of LM**: [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail)
|
2631 |
+
- **Dense Retrieval**: [LLM Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), [BGE Embedding](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/baai_general_embedding), [C-MTEB](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB)
|
2632 |
+
- **Reranker Model**: [BGE Reranker](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker)
|
2633 |
+
|
2634 |
+
|
2635 |
+
## News
|
2636 |
+
|
2637 |
+
- 11/23/2023: Release [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail), a method to maintain general capabilities during fine-tuning by merging multiple language models. [Technical Report](https://arxiv.org/abs/2311.13534) :fire:
|
2638 |
+
- 10/12/2023: Release [LLM-Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), a unified embedding model to support diverse retrieval augmentation needs for LLMs. [Technical Report](https://arxiv.org/pdf/2310.07554.pdf)
|
2639 |
- 09/15/2023: The [technical report](https://arxiv.org/pdf/2309.07597.pdf) of BGE has been released
|
2640 |
+
- 09/15/2023: The [massive training data](https://data.baai.ac.cn/details/BAAI-MTP) of BGE has been released
|
2641 |
- 09/12/2023: New models:
|
2642 |
- **New reranker model**: release cross-encoder models `BAAI/bge-reranker-base` and `BAAI/bge-reranker-large`, which are more powerful than embedding models. We recommend using or fine-tuning them to re-rank the top-k documents returned by embedding models.
|
2643 |
- **Updated embedding model**: release the `bge-*-v1.5` embedding models to alleviate the similarity distribution issue and enhance retrieval ability without instruction.
|
|
|
2662 |
|
2663 |
| Model | Language | | Description | query instruction for retrieval [1] |
|
2664 |
|:-------------------------------|:--------:| :--------:| :--------:|:--------:|
|
2665 |
+
| [LM-Cocktail](https://huggingface.co/Shitao) | English | | fine-tuned models (Llama and BGE) which can be used to reproduce the results of LM-Cocktail | |
|
2666 |
| [BAAI/llm-embedder](https://huggingface.co/BAAI/llm-embedder) | English | [Inference](./FlagEmbedding/llm_embedder/README.md) [Fine-tune](./FlagEmbedding/llm_embedder/README.md) | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See [README](./FlagEmbedding/llm_embedder/README.md) |
|
2667 |
| [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
|
2668 |
| [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
|
|
|
2992 |
For more details, please refer to [./FlagEmbedding/reranker/README.md](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker)
|
2993 |
|
2994 |
|
2995 |
|
2996 |
|
2997 |
## Citation
|