CUDA error when using HuggingFaceEmbedding from LlamaIndex

#7
by glpcc - opened

First, thanks for sharing the model. I've been trying to use this model as the embedding engine with LlamaIndex, but it crashed with the CUDA error "CUBLAS_STATUS_EXECUTION_FAILED", probably caused by an out-of-bounds memory access.
The important thing is that I fixed it by setting "max_length=512". It occurred to me that this might be an error in one of the config files, since a similar parameter is 514 in config.json but 512 in tokenizer_config.json.
I'm new to this field so I can't confirm it, but I'm posting this in case anyone else runs into the same issue.
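For anyone hitting the same crash, the fix above can be sketched roughly like this. This is a hedged example, not the thread author's exact code: it assumes LlamaIndex's `HuggingFaceEmbedding` wrapper (which accepts a `max_length` argument), and the model name is a placeholder since the thread doesn't name the repo.

```python
# Sketch: cap the sequence length at 512 so token positions never exceed
# the model's 514-slot position-embedding table.
MAX_LENGTH = 512  # usable content length (514 positions minus 2 reserved slots)

def build_embedding_model(model_name="your-org/your-embedding-model"):
    # Hypothetical usage; requires `llama-index` and its HuggingFace
    # embeddings integration to be installed.
    from llama_index.embeddings.huggingface import HuggingFaceEmbedding
    return HuggingFaceEmbedding(model_name=model_name, max_length=MAX_LENGTH)
```

Without `max_length`, over-long inputs produce position indices past the end of the embedding table, which on GPU can surface as the opaque `CUBLAS_STATUS_EXECUTION_FAILED` rather than a clean Python IndexError.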

Owner

Yeah, you should specify max_length=512 when running the tokenizer; otherwise over-long inputs cause indexing errors in the model's forward pass.
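If you call the tokenizer directly rather than going through LlamaIndex, the same fix looks roughly like this. A minimal sketch using Hugging Face `transformers`; the model name is a stand-in, not the repo under discussion:

```python
# Sketch: enforce truncation at the tokenizer so sequences fit the
# model's position-embedding table.

def embed_safe(texts, model_name="xlm-roberta-base"):
    # Requires `transformers` and `torch`; downloads weights on first call.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    # truncation=True + max_length=512 guarantees no token gets a position
    # index outside the 514-row embedding table (512 content + 2 reserved).
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=512, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    return out.last_hidden_state[:, 0]  # [CLS]-token embeddings
```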

For the reason why this model (based on XLM-RoBERTa) has 514 position embeddings, see the discussion at https://github.com/facebookresearch/fairseq/issues/1187
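In short: fairseq-style models number positions starting after the padding index (padding_idx=1), so positions for content tokens run from 2 upward. A pure-Python sketch of that numbering scheme (mirroring the logic described in the linked issue, not the library's actual implementation) shows where 514 comes from:

```python
# Sketch of fairseq/XLM-RoBERTa position numbering: padding tokens keep
# position padding_idx; real tokens count up from padding_idx + 1.
PADDING_IDX = 1

def position_ids(input_ids, padding_idx=PADDING_IDX):
    ids, pos = [], padding_idx
    for tok in input_ids:
        if tok == padding_idx:
            ids.append(padding_idx)  # pad tokens all share the pad position
        else:
            pos += 1
            ids.append(pos)
    return ids

# 512 content tokens -> highest position index is 1 + 512 = 513,
# so the embedding table needs 514 rows (indices 0..513).
max_pos = max(position_ids([5] * 512))
```

That is why config.json reports 514 (`max_position_embeddings`, the table size) while the tokenizer's effective limit is 512 tokens.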

Ok thanks for the clarification!
