unable to mmap: AutoModelForCausalLM.from_pretrained / from safetensors import safe_open

#3
by jonasmockzf - opened

Hello, I'm trying to load your model with the transformers library, but it won't allocate the memory on my GPUs. It tries to allocate CPU RAM instead and fails.

CPU RAM: 8 GB
GPUs: 2 × NVIDIA Tesla V100 (32 GB each)

There is enough space on my GPUs. Could you help me with this issue? How do I force the model to load only onto the GPUs, or does it always load into CPU RAM first and then move to GPU VRAM?
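
For reference, my loading call looks roughly like this (the repo id is a placeholder, and I'm not sure the device_map setting is right):

```python
import torch
from transformers import AutoModelForCausalLM

# Try to place the weights directly on the two V100s.
model = AutoModelForCausalLM.from_pretrained(
    "org/model-name",           # placeholder for the actual repo id
    torch_dtype=torch.float16,  # half precision to fit in 2 x 32 GB VRAM
    device_map="auto",          # let accelerate shard across the GPUs
    low_cpu_mem_usage=True,     # avoid materializing a full copy in RAM
)
```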

(screenshot attached: Error.PNG)

If I upgrade the VM to 32 GB of CPU RAM it works fine and everything is allocated to GPU VRAM... Why doesn't it work with 8 GB of CPU RAM if that RAM apparently isn't used?
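
Would an explicit max_memory map force GPU-only placement? A sketch of what I mean (the limits are guesses based on the accelerate docs, not a confirmed fix):

```python
import torch
from transformers import AutoModelForCausalLM

# Cap what each device may receive: keys 0 and 1 are the GPU indices,
# "cpu" limits how much may be offloaded to system RAM.
model = AutoModelForCausalLM.from_pretrained(
    "org/model-name",  # placeholder repo id
    torch_dtype=torch.float16,
    device_map="auto",
    max_memory={0: "30GiB", 1: "30GiB", "cpu": "1GiB"},
)
```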
