
Memory is becoming fully exhausted during the generation of embeddings, leading to a complete server crash.

#41
by vobbilisettyjayadeep - opened

Hi all,
I am trying to create embeddings for 15 lakh (1.5 million) rows of data using sentence-transformers/all-MiniLM-L6-v2 for an application, and to upload the embeddings to a pgvector database.
While creating the embeddings, the server's memory gets completely exhausted and the server crashes.

Please help me here.

Sentence Transformers org

Hello!

I'm aware of this issue. The gist is that as more of the texts get turned into embeddings, the already processed embeddings all remain in memory until all texts have been processed. This can lead to high memory usage. My recommendation at this time is to chunk your texts and only process e.g. 1 lakh sentences at a time, upload those embeddings, and then do the next chunk.
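A minimal sketch of that chunked approach, assuming your texts are in a Python list. `upload_to_pgvector` here is a hypothetical stand-in for your own database insert, not part of any library:

```python
# Hedged sketch of the chunked approach: encode ~1 lakh texts at a time,
# upload that batch of embeddings, and let them be garbage-collected
# before starting the next chunk, so memory stays bounded by one chunk.

def chunks(items, size):
    """Yield successive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def embed_and_upload(texts, upload_to_pgvector, chunk_size=100_000):
    # Deferred import: the model is only needed once we actually embed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    for chunk in chunks(texts, chunk_size):
        embeddings = model.encode(chunk, batch_size=64, show_progress_bar=True)
        upload_to_pgvector(chunk, embeddings)  # persist this batch
        # `embeddings` is rebound on the next iteration, so the previous
        # batch can be freed instead of accumulating for all 15 lakh rows.
```

Tune `chunk_size` to whatever comfortably fits in your server's memory alongside the model.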
Hope this helps.

  • Tom Aarsen

Hey @tomaarsen, thank you for your reply.
For now I am just doing a POC. If it is successful, I will scale up to 5 crore+ (50 million+) rows of data, and at that scale this approach is not practical.
Is there any way to do parallel processing for creating the embeddings?

Sentence Transformers org

Yes, you can use https://sbert.net/docs/package_reference/SentenceTransformer.html#sentence_transformers.SentenceTransformer.encode_multi_process for encoding on multiple processes or multiple GPUs, but the memory issue might still persist then. Chunking remains a good option I think.

Will check the above link and get back to you ASAP.
Thanks!!!
