BAAI/bge-large-zh-v1.5 · Asking for an embedding model that supports both English and Chinese

Dec 7, 2023

Hi,

I am working on a RAG (Retrieval-Augmented Generation) application, and my local documents include both English and Chinese.

When I tested bge-large-zh-v1.5 and bge-large-en-v1.5, each of them worked OK for its respective language (Zh / En).

I also tried bge-reranker-large as an embedding model, but it doesn't work well for either English or Chinese. However, it works like a charm as a re-ranker.

So, do you have any plans to release an embedding model that works well for both English and Chinese? If so, can you share an ETA?

Shitao

Beijing Academy of Artificial Intelligence org Dec 7, 2023

Hi, thanks for your interest in our work!
The reranker model directly computes a relevance score for a query and passage pair; it cannot be used to map text into an embedding.
We plan to release a new multilingual model in January.
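
To make the embedding vs. reranker distinction concrete, here is a minimal sketch of the usual two-stage setup (not official BAAI code). The helper names `retrieve`, `rerank`, `toy_embed`, and `toy_score_pair` are hypothetical and exist only for illustration; the toy functions are crude stand-ins so the example runs without downloading any model.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, passages: List[str],
             embed: Callable[[str], Vector], top_k: int = 3) -> List[Tuple[str, float]]:
    """Stage 1 (embedding model): every text is encoded on its own,
    then query and passage vectors are compared."""
    q_vec = embed(query)
    scored = [(p, cosine(q_vec, embed(p))) for p in passages]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


def rerank(query: str, candidates: List[str],
           score_pair: Callable[[str, str], float]) -> List[Tuple[str, float]]:
    """Stage 2 (reranker): each (query, passage) pair is scored jointly.
    No per-text vector is ever produced, so this cannot replace stage 1."""
    scored = [(p, score_pair(query, p)) for p in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for an embedding model: character counts in 32 buckets."""
    vec = [0.0] * 32
    for ch in text.lower():
        if not ch.isspace():
            vec[ord(ch) % 32] += 1.0
    return vec


def toy_score_pair(query: str, passage: str) -> float:
    """Hypothetical stand-in for a reranker: fraction of query words found in the passage."""
    q_words = set(query.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words) / len(q_words) if q_words else 0.0


if __name__ == "__main__":
    docs = [
        "BGE embedding models map text to vectors",
        "A reranker scores a query and passage pair",
        "北京是中国的首都",
    ]
    question = "reranker scores query passage"
    candidates = [p for p, _ in retrieve(question, docs, toy_embed, top_k=2)]
    for passage, score in rerank(question, candidates, toy_score_pair):
        print(f"{score:.2f}  {passage}")
```

In a real pipeline, and assuming the FlagEmbedding package is installed, `embed` could wrap `FlagModel(...).encode` and `score_pair` could wrap `FlagReranker(...).compute_score`; the rest of the flow stays the same.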

phamvantoan

Dec 7, 2023

Thank you for your quick response!

Hope to see the new model in January!