Out of memory

#7
by wahab12 - opened

Traceback (most recent call last):
File "main.py", line 44, in <module>
model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
File "/home/administrator/.local/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
return model_class.from_pretrained(
File "/home/administrator/.local/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3298, in from_pretrained
) = cls._load_pretrained_model(
File "/home/administrator/.local/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3686, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/administrator/.local/lib/python3.8/site-packages/transformers/modeling_utils.py", line 741, in _load_state_dict_into_meta_model
set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
File "/home/administrator/.local/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 317, in set_module_tensor_to_device
new_value = value.to(device)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacty of 1.95 GiB of which 2.75 MiB is free. Including non-PyTorch memory, this process has 1.95 GiB memory in use. Of the allocated memory 1.92 GiB is allocated by PyTorch, and 1.91 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

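I believe you simply have far too little VRAM: the error shows GPU 0 has only 1.95 GiB in total, and all of it is already in use. Try a GGUF version of the model, as it can run from your system RAM instead of VRAM.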
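To make the "too little VRAM" point concrete, here is a rough back-of-the-envelope sketch. It only estimates the memory needed for the weights themselves (activations and the KV cache come on top). The 1.95 GiB figure is taken from the error message; the 7-billion parameter count and the helper names `weights_gib` and `fits` are assumptions for illustration, since the thread does not say which model is being loaded.

```python
GIB = 1024 ** 3  # bytes per GiB


def weights_gib(num_params: int, bytes_per_param: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


def fits(num_params: int, bytes_per_param: float, capacity_gib: float) -> bool:
    """True if the weights alone would fit in the given capacity."""
    return weights_gib(num_params, bytes_per_param) <= capacity_gib


GPU_CAPACITY_GIB = 1.95               # total capacity reported in the traceback
HYPOTHETICAL_PARAMS = 7_000_000_000   # assumed model size, not from the thread

if __name__ == "__main__":
    # Compare a few common precisions against the GPU's total capacity.
    for label, bytes_per_param in [("fp16", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
        need = weights_gib(HYPOTHETICAL_PARAMS, bytes_per_param)
        ok = fits(HYPOTHETICAL_PARAMS, bytes_per_param, GPU_CAPACITY_GIB)
        print(f"{label}: about {need:.2f} GiB needed, fits in VRAM: {ok}")
```

Under that assumed size, not even a 4-bit copy of the weights fits on this GPU, which is why running from system RAM is the more realistic route here.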
