Shitao commited on
Commit
2275a7b
·
1 Parent(s): 4bd7067

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +12 -9
README.md CHANGED
@@ -2615,7 +2615,6 @@ language:
2615
  <a href=#usage>Usage</a> |
2616
  <a href="#evaluation">Evaluation</a> |
2617
  <a href="#train">Train</a> |
2618
- <a href="#contact">Contact</a> |
2619
  <a href="#citation">Citation</a> |
2620
  <a href="#license">License</a>
2621
  <p>
@@ -2626,13 +2625,19 @@ More details please refer to our Github: [FlagEmbedding](https://github.com/Flag
2626
 
2627
  [English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)
2628
 
2629
- FlagEmbedding can map any text to a low-dimensional dense vector which can be used for tasks like retrieval, classification, clustering, or semantic search.
2630
- And it also can be used in vector databases for LLMs.
2631
 
2632
- ************* 🌟**Updates**🌟 *************
2633
- - 10/12/2023: Release [LLM-Embedder](./FlagEmbedding/llm_embedder/README.md), a unified embedding model to support diverse retrieval augmentation needs for LLMs. [Paper](https://arxiv.org/pdf/2310.07554.pdf) :fire:
 
 
 
 
 
 
 
2634
  - 09/15/2023: The [technical report](https://arxiv.org/pdf/2309.07597.pdf) of BGE has been released
2635
- - 09/15/2023: The [masive training data](https://data.baai.ac.cn/details/BAAI-MTP) of BGE has been released
2636
  - 09/12/2023: New models:
2637
  - **New reranker model**: release cross-encoder models `BAAI/bge-reranker-base` and `BAAI/bge-reranker-large`, which are more powerful than embedding model. We recommend to use/fine-tune them to re-rank top-k documents returned by embedding models.
2638
  - **update embedding model**: release `bge-*-v1.5` embedding model to alleviate the issue of the similarity distribution, and enhance its retrieval ability without instruction.
@@ -2657,6 +2662,7 @@ And it also can be used in vector databases for LLMs.
2657
 
2658
  | Model | Language | | Description | query instruction for retrieval [1] |
2659
  |:-------------------------------|:--------:| :--------:| :--------:|:--------:|
 
2660
  | [BAAI/llm-embedder](https://huggingface.co/BAAI/llm-embedder) | English | [Inference](./FlagEmbedding/llm_embedder/README.md) [Fine-tune](./FlagEmbedding/llm_embedder/README.md) | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See [README](./FlagEmbedding/llm_embedder/README.md) |
2661
  | [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
2662
  | [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
@@ -2986,9 +2992,6 @@ The data format is the same as embedding model, so you can fine-tune it easily f
2986
  More details please refer to [./FlagEmbedding/reranker/README.md](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker)
2987
 
2988
 
2989
- ## Contact
2990
- If you have any question or suggestion related to this project, feel free to open an issue or pull request.
2991
- You also can email Shitao Xiao(stxiao@baai.ac.cn) and Zheng Liu(liuzheng@baai.ac.cn).
2992
 
2993
 
2994
  ## Citation
 
2615
  <a href=#usage>Usage</a> |
2616
  <a href="#evaluation">Evaluation</a> |
2617
  <a href="#train">Train</a> |
 
2618
  <a href="#citation">Citation</a> |
2619
  <a href="#license">License</a>
2620
  <p>
 
2625
 
2626
  [English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)
2627
 
2628
+ FlagEmbedding focus on retrieval-augmented LLMs, consisting of following projects currently:
 
2629
 
2630
+ - **Fine-tuning of LM** : [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail)
2631
+ - **Dense Retrieval**: [LLM Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), [BGE Embedding](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/baai_general_embedding), [C-MTEB](https://github.com/FlagOpen/FlagEmbedding/tree/master/C_MTEB)
2632
+ - **Reranker Model**: [BGE Reranker](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker)
2633
+
2634
+
2635
+ ## News
2636
+
2637
+ - 11/23/2023: Release [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail), a method to maintain general capabilities during fine-tuning by merging multiple language models. [Technical Report](https://arxiv.org/abs/2311.13534) :fire:
2638
+ - 10/12/2023: Release [LLM-Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), a unified embedding model to support diverse retrieval augmentation needs for LLMs. [Technical Report](https://arxiv.org/pdf/2310.07554.pdf)
2639
  - 09/15/2023: The [technical report](https://arxiv.org/pdf/2309.07597.pdf) of BGE has been released
2640
+ - 09/15/2023: The [massive training data](https://data.baai.ac.cn/details/BAAI-MTP) of BGE has been released
2641
  - 09/12/2023: New models:
2642
  - **New reranker model**: release cross-encoder models `BAAI/bge-reranker-base` and `BAAI/bge-reranker-large`, which are more powerful than embedding model. We recommend to use/fine-tune them to re-rank top-k documents returned by embedding models.
2643
  - **update embedding model**: release `bge-*-v1.5` embedding model to alleviate the issue of the similarity distribution, and enhance its retrieval ability without instruction.
 
2662
 
2663
  | Model | Language | | Description | query instruction for retrieval [1] |
2664
  |:-------------------------------|:--------:| :--------:| :--------:|:--------:|
2665
+ | [LM-Cocktail](https://huggingface.co/Shitao) | English | | fine-tuned models (Llama and BGE) which can be used to reproduce the results of LM-Cocktail | |
2666
  | [BAAI/llm-embedder](https://huggingface.co/BAAI/llm-embedder) | English | [Inference](./FlagEmbedding/llm_embedder/README.md) [Fine-tune](./FlagEmbedding/llm_embedder/README.md) | a unified embedding model to support diverse retrieval augmentation needs for LLMs | See [README](./FlagEmbedding/llm_embedder/README.md) |
2667
  | [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
2668
  | [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) | Chinese and English | [Inference](#usage-for-reranker) [Fine-tune](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/reranker) | a cross-encoder model which is more accurate but less efficient [2] | |
 
2992
  More details please refer to [./FlagEmbedding/reranker/README.md](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/reranker)
2993
 
2994
 
 
 
 
2995
 
2996
 
2997
  ## Citation