How much memory is needed to create sentence embeddings?
I am trying to create sentence embeddings from a corpus that is ~40 MB in size, but I keep getting CUDA out-of-memory errors. Why does it try to allocate such a huge amount of memory for such a small dataset? What am I doing wrong?
Training code (I haven't modified the given code in any way other than moving it to the GPU):

    corpus_embeddings = encode_batch(sentence_encoder, tokenizer, corpus, "cuda")

My CPU runs also crash due to memory issues. I am training on a single A6000 GPU with 48 GB of RAM.
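For context, here is a minimal sketch of what I believe encode_batch boils down to. The model name, the naive pooling, and the tiny placeholder corpus are my own simplifications rather than the exact code I'm following; the point is that the whole corpus goes through the model as one batch:

    import torch
    from transformers import AutoModel, AutoTokenizer

    def encode_batch(model, tokenizer, sentences, device):
        # Tokenize *all* sentences at once and run a single forward pass,
        # i.e. the entire corpus is treated as one giant batch.
        inputs = tokenizer(sentences, padding=True, truncation=True,
                           return_tensors="pt").to(device)
        with torch.no_grad():
            outputs = model(**inputs)
        # Naive mean over the sequence dimension -> one vector per sentence
        # (ignores the attention mask for brevity).
        return outputs.last_hidden_state.mean(dim=1)

    model_name = "sentence-transformers/all-MiniLM-L6-v2"  # placeholder encoder
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    sentence_encoder = AutoModel.from_pretrained(model_name).to("cuda")

    corpus = ["first example sentence.", "second example sentence."]  # in my case: the full ~40 MB list of sentences
    corpus_embeddings = encode_batch(sentence_encoder, tokenizer, corpus, "cuda")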
The full error:

    OutOfMemoryError: CUDA out of memory. Tried to allocate 80.32 GiB (GPU 0; 47.54 GiB total capacity; 20.52 GiB already allocated; 26.63 GiB free; 20.56 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF