Multiple arrays of vectors returned from feature extraction inference endpoint

#2
by jleeds - opened

Hello, I've created an inference endpoint for feature extraction with the BAAI/bge-small-en-v1.5 model. When I send a text string for embedding, I expect a response containing a single vector of 384 values. Instead I receive multiple arrays of 384 values each (the number of arrays varies between 5 and 11).

When I tested with the hosted inference API on the model's Hugging Face page, the response was only a single array of 384 values. https://huggingface.co/BAAI/bge-small-en-v1.5?text=Hello+world

Why am I receiving multiple vector arrays in the response, and can I limit it to one? If not, can I just take the first of these as my embedding?

Thanks!

Beijing Academy of Artificial Intelligence org

Hi, we use the hidden state of the first token ([CLS]) as the sentence embedding. You can refer to the code at https://github.com/FlagOpen/FlagEmbedding/tree/master#using-huggingface-transformers to get the sentence embedding.
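If the endpoint is returning one 384-value array per token (which would explain why the count changes with the input length), picking the first array should correspond to the sentence embedding described above. Here is a minimal sketch of that step in plain Python. The response shape (`[tokens][dim]`, possibly wrapped in an extra batch dimension) is an assumption about what the endpoint sends back, `cls_embedding` is a hypothetical helper name, and the L2 normalization mirrors what the linked example does after pooling.

```python
import math


def cls_embedding(token_vectors, normalize=True):
    """Return the first token's vector from per-token output, optionally L2-normalized.

    Assumes `token_vectors` is a nested list shaped [tokens][dim], or
    [1][tokens][dim] when the response carries a batch dimension.
    """
    # Unwrap a leading batch dimension if one is present.
    if token_vectors and isinstance(token_vectors[0][0], list):
        token_vectors = token_vectors[0]

    # The first row is the vector for the first token ([CLS]).
    first = [float(x) for x in token_vectors[0]]
    if not normalize:
        return first

    # L2-normalize so that dot products behave like cosine similarity.
    norm = math.sqrt(sum(x * x for x in first))
    return [x / norm for x in first] if norm > 0 else first


# Toy stand-in for an endpoint response: 3 tokens, 2 dimensions each.
response = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
embedding = cls_embedding(response)
```

With a real response you would pass the parsed JSON body in place of `response` and should get back a list of 384 floats. Comparing that result against the output of the hosted widget for the same text is a quick way to check the assumption about the response shape.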
