CUDA Out of memory

#3
by Afster - opened

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 18.00 MiB (GPU 0; 8.00 GiB total capacity; 7.08 GiB already allocated; 0 bytes free; 7.31 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
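The error text itself suggests `max_split_size_mb` when reserved memory far exceeds allocated memory, which is the case here (7.31 GiB reserved vs. 7.08 GiB allocated). A minimal sketch of setting it via the environment variable the message names; the value 128 is illustrative, not a known fix for this model:

```shell
# Ask PyTorch's caching allocator to cap split block size at 128 MiB,
# which can reduce fragmentation when reserved >> allocated.
# (Illustrative value; tune or remove if it doesn't help.)
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
echo "$PYTORCH_CUDA_ALLOC_CONF"
```

On Windows (where the `call python server.py ...` commands below are run), the equivalent is `set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128` in the same terminal or batch file before launching the server.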

I'm getting this issue. I tried changing the pre_layer setting within oobabooga, but with no success.

I've been able to get responses on an RTX 2060 Super 8 GB card with the following flags in ooba:

call python server.py --auto-devices --extensions api --model notstoic_pygmalion-13b-4bit-128g --model_type LLaMA --wbits 4 --groupsize 128 --no-cache --pre_layer 30

Not working for me
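Since --pre_layer controls how many layers stay on the GPU (the rest are offloaded), one variation worth trying when 30 still runs out of memory is a lower value, trading speed for VRAM. The 20 below is an illustrative number, not a tested setting for this card:

```shell
call python server.py --auto-devices --extensions api --model notstoic_pygmalion-13b-4bit-128g --model_type LLaMA --wbits 4 --groupsize 128 --no-cache --pre_layer 20
```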

Afster changed discussion status to closed
Afster changed discussion status to open
