CUDA out of memory exception
Hello,
I have two 24 GB RTX 4090 GPUs.
I want to fine-tune the 70b model, but it throws a CUDA out of memory exception even though I have used LoRA and BitsAndBytesConfig.
Let me know if I'm overlooking something, or please give me suggestions.
Thanks.
I wonder whether you tried 13b or 40b and got them working before moving up to 70b?
Possibility: are you using 32-bit floating point (FP32) for training or inference? If so, consider switching to FP16, a 16-bit floating point format.
- Maybe emptying the memory cache with torch.cuda.empty_cache() would also help (a quick sketch of both ideas is below)?
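Something along these lines, as a rough sketch only; the checkpoint name is just a placeholder, and device_map="auto" is an assumption to spread the weights across your two GPUs:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint name -- substitute whichever model you are loading.
model_name = "meta-llama/Llama-2-13b-hf"

# Load weights in FP16 instead of the default FP32 to roughly halve memory use.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",  # assumption: shard the layers across both GPUs
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Free unused blocks held by PyTorch's CUDA caching allocator between runs
# (this does not release memory still referenced by live tensors).
torch.cuda.empty_cache()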
Hi @faroncoder,
Thanks for your reply.
- Yes, I tried with 13b and it works fine.
- Yes, I'm using FP16 and torch.cuda.empty_cache().
Please check my LoRA and BitsAndBytesConfig configurations below.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)

# LoRA adapter configuration (`modules` is defined elsewhere in my script)
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=modules,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)
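For completeness, here is a rough sketch of how configs like these are typically applied; the model name is a placeholder and the exact loading steps are an outline rather than my full training script:

from transformers import AutoModelForCausalLM
from peft import get_peft_model, prepare_model_for_kbit_training

# Placeholder checkpoint id -- replace with the actual 70b model.
model_name = "meta-llama/Llama-2-70b-hf"

# Load the base model in 4-bit using the BitsAndBytes config above.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",  # shard layers across the two 24 GB GPUs
)

# Prepare the quantized model for k-bit training, then attach the LoRA adapters.
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, config)
model.print_trainable_parameters()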