unable to mmap: AutoModelForCausalLM.from_pretrained / from safetensors import safe_open
#3 · opened by jonasmockzf
Hello, I'm trying to load your model with the transformers library, but it won't allocate the memory on my GPUs. Instead it tries to allocate in CPU RAM and fails.
CPU RAM: 8 GB
GPU: 2 x Nvidia Tesla V100 32 GB
There is enough space on my GPUs. Could you maybe help with this issue? How do I force the model to load only onto the GPUs, or does it always load into CPU RAM first and then into GPU VRAM?
If I upgrade the VM to 32 GB of CPU RAM it works fine and everything gets allocated to GPU VRAM... Why doesn't it work with 8 GB of CPU RAM if that memory apparently isn't used in the end? lol
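For reference, this is roughly the loading call I'm using; "model-id" is just a placeholder for the actual repo, and `device_map` / `low_cpu_mem_usage` are the options that, as far as I understand, are supposed to keep the weights from being fully materialized in CPU RAM (please correct me if I'm using them wrong):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "model-id",                 # placeholder for the actual model repo
    torch_dtype=torch.float16,  # half precision so it fits in 2 x 32 GB VRAM
    device_map="auto",          # let accelerate shard layers across both V100s
    low_cpu_mem_usage=True,     # load weights incrementally instead of all at once in RAM
)
```

Note that `device_map="auto"` requires the `accelerate` package to be installed.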