CUDA out of memory
I use an NVIDIA GeForce RTX 3090 GPU with 24 GB of VRAM.
When I run the demo code, I get this error:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 224.00 MiB. GPU 0 has a total capacty of 23.69 GiB of which 185.62 MiB is free. Including non-PyTorch memory, this process has 23.50 GiB memory in use. Of the allocated memory 22.83 GiB is allocated by PyTorch, and 1.17 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
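The error message itself hints at one mitigation: if reserved-but-unallocated memory is large, fragmentation may be part of the problem, and max_split_size_mb can help. A minimal sketch of setting it (the value 128 is an illustrative guess, not from this thread; set the variable before any CUDA allocation happens):
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"  # read by the caching allocator when it initializes

import torch  # import and allocate only after the variable is set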
Could you specify which demo? Are you loading in float16 or bfloat16? You should use accelerate with device_map="auto" to overcome potential memory issues. It should fit in 24 GB.
Maybe try reducing the number of tokens that are generated; 1000 seems like a lot. Try a smaller number like 40 and go from there.
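For reference, a minimal sketch of capping the generation length (assuming model and tokenizer are already loaded; the prompt and the value 40 are placeholders):
inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)  # cap the number of newly generated tokens
print(tokenizer.decode(outputs[0], skip_special_tokens=True))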
It doesn't help; the error appears at model.to(device1).
Then try loading in 8-bit, or use accelerate with device_map = "auto" :hug:
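A minimal sketch of the 8-bit option (requires bitsandbytes; the model id is the one from this thread):
# pip install bitsandbytes accelerate
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    load_in_8bit=True,   # 8-bit weights via bitsandbytes
    device_map="auto",   # let accelerate place the layers
)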
@logan39522361tq
Can you try to load the model as such:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1", torch_dtype=torch.float16, device_map="auto")
...
The snippet you shared will load the model in full precision (~28 GB), hence the GPU out-of-memory error you get.
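Rough arithmetic behind that figure (assuming ~7B parameters at 4 bytes each in float32):
params = 7_000_000_000
print(params * 4 / 1e9)  # ≈ 28 GB in float32
print(params * 2 / 1e9)  # ≈ 14 GB in float16, which fits in 24 GB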
Alternatively you can also do:
# pip install bitsandbytes
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1", load_in_4bit=True)
...
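In more recent transformers versions the same thing is usually expressed with a BitsAndBytesConfig (a sketch, assuming a version that provides it):
# pip install bitsandbytes
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    quantization_config=quantization_config,
    device_map="auto",
)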
I used this to initialize the model:
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1", torch_dtype=torch.float16, device_map="auto")
but when I run
model.to(device)
I get this error (using a Python virtual env):
lib/python3.10/site-packages/accelerate/big_modeling.py", line 425, in wrapper
raise RuntimeError("You can't move a model that has some modules offloaded to cpu or disk."
this is because device_map="auto" has automatically offloaded some of your model onto CPU or disk. What is your GPU's total VRAM?
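To check where accelerate actually placed each module, you can print the device map it records on the model (a sketch; hf_device_map is populated when device_map is used):
print(model.hf_device_map)  # maps module names to a GPU index, "cpu", or "disk"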
NVIDIA GeForce RTX 3090 GPU with 24 GB of VRAM
The model has 44B parameters (you need ~90 GB of VRAM to fit it on a GPU in half precision), so it will not fit into your GPU. Please consider running the model in 4-bit precision, or use CPU / disk offloading at the risk of not being able to call model.to(device).
code:
model = AutoModelForCausalLM.from_pretrained(path, load_in_4bit=True)
config.json:
"torch_dtype": "bfloat16",
When I call model.to(device), it raises this error:
.to is not supported for 4-bit or 8-bit bitsandbytes models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct dtype.
Loading the model with quantization automatically dispatches it across the available devices, so there is no need to call .to; calling it also creates issues with offloading.
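A minimal sketch of using the quantized, already-dispatched model without calling .to on it (only the inputs are moved; the prompt is a placeholder):
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)  # move the inputs, not the model
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))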