For text similarity, which pooling strategy is better?

#17

by zhihengyu - opened May 21

zhihengyu

May 21

Hi,

Which pooling strategy is better for text similarity: CLS, MEAN, or MAX?

CLS strategy: as in the demo, use the output of the CLS token.
MEAN strategy: compute the mean of all output vectors.
MAX strategy: compute a max-over-time of the output vectors.

Shitao

Beijing Academy of Artificial Intelligence org May 21

If you want to fine-tune a model from scratch: CLS ~= MEAN > MAX.
For BGE models, you need to use CLS pooling, because we fine-tuned the models with CLS pooling.
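
For reference, here is a minimal sketch of the three strategies in plain Python. It is only an illustration, not the BGE code: it assumes the CLS token sits at position 0 and that a 0/1 attention mask marks padding, and the function names and toy vectors are made up for the example. In practice you would do the same thing on the model's output tensors.

```python
import math

def cls_pool(token_vectors, mask):
    # Assumption: the CLS token is the first position of the sequence.
    return list(token_vectors[0])

def mean_pool(token_vectors, mask):
    # Average each dimension over the non-padding tokens only.
    kept = [v for v, m in zip(token_vectors, mask) if m]
    dim = len(kept[0])
    return [sum(v[i] for v in kept) / len(kept) for i in range(dim)]

def max_pool(token_vectors, mask):
    # Max-over-time: largest value per dimension over non-padding tokens.
    kept = [v for v, m in zip(token_vectors, mask) if m]
    dim = len(kept[0])
    return [max(v[i] for v in kept) for i in range(dim)]

def cosine(a, b):
    # Cosine similarity between two pooled sentence vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy token outputs (hypothetical values); the last token is padding.
tokens = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

cls_vec = cls_pool(tokens, mask)
mean_vec = mean_pool(tokens, mask)
max_vec = max_pool(tokens, mask)
```

Whichever strategy you pick, use the same one at inference time that the model was fine-tuned with, then compare the pooled vectors with `cosine`.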
