BAAI/bge-base-en-v1.5 · Normalization of outputs

Sep 27, 2023

First of all, thanks for the nice work and for sharing your model and results 🤗

In the model card you mention the normalization of outputs, but in your repository I stumbled upon this line. I have two questions:

Did you normalize the vectors for the retrieval tasks?
And (also for the retrieval tasks) did you report the results for the dot-product or the L2 metric in the model card?

And just a tiny remark: if you always want to use normalization, you could consider specifying it in the configuration, as is done for this model.

Shitao

Beijing Academy of Artificial Intelligence org Sep 27, 2023

Hi,

The default similarity for the retrieval task is cosine, so whether or not the embeddings are normalized doesn't influence the results. This command only impacts the clustering and classification tasks.
Neither. We compute the cosine similarity to retrieve relevant passages.
Thanks for your advice! 🤗 We will consider updating the configuration.
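
To illustrate why normalization does not matter when the metric is cosine, here is a minimal sketch in plain Python. The vectors are hypothetical toy values, not real model outputs, and the helper names `cosine` and `l2_normalize` are made up for this example. Cosine similarity already divides by both vector lengths, so rescaling either vector to unit length leaves the score unchanged.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def l2_normalize(v):
    """Scale a vector to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

# Hypothetical toy embeddings (assumed values for illustration only).
query = [0.3, -1.2, 2.0]
passage = [1.5, 0.4, -0.7]

raw_score = cosine(query, passage)
normalized_score = cosine(l2_normalize(query), l2_normalize(passage))

# The two scores agree up to floating-point error.
print(raw_score, normalized_score)
```

The same would not hold for a raw dot product or an L2 distance, which both depend on vector length.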

Shitao

Beijing Academy of Artificial Intelligence org Sep 27, 2023

We have updated the configuration. Thanks again for your suggestion!
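
As a side note on the dot vs. L2 question: once embeddings are always normalized, the choice of metric no longer changes the ranking. For unit vectors the dot product equals the cosine, and the squared L2 distance is 2 - 2 * dot, so a higher dot product always means a smaller distance. The sketch below checks this on hypothetical toy vectors; the helper names are made up for this example and nothing here comes from the model's actual code.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical toy embeddings, normalized as a Normalize step would do.
query = l2_normalize([0.3, -1.2, 2.0])
passages = [
    l2_normalize(p)
    for p in ([1.5, 0.4, -0.7], [0.2, -1.0, 1.9], [-0.5, 0.8, 0.1])
]

# Rank passage indices by descending dot product and by ascending L2 distance.
by_dot = sorted(range(len(passages)), key=lambda i: -dot(query, passages[i]))
by_l2 = sorted(range(len(passages)), key=lambda i: l2_distance(query, passages[i]))

# Both orderings are identical for unit-length vectors.
print(by_dot, by_l2)
```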