How much GPU memory does internLM-chat-7b need?

#4
by PotatoesJay - opened

Problem:
When loading the weights onto the GPU, I hit this error:

return self._apply(lambda t: t.cuda(device))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 172.00 MiB (GPU 0; 23.66 GiB total capacity; 23.33 GiB already allocated; 73.44 MiB free; 23.33 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Reproduce:

  1. download weight from https://huggingface.co/internlm/internlm-chat-7b/tree/main;
  2. run
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = '/home/lalala/Downloads/internLM-chat-7b'  # local weight directory
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True).cuda()
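For context on the title question, here is a rough back-of-the-envelope estimate (a sketch, not an exact measurement; the 7e9 parameter count is an approximation, and activations and the KV cache need memory on top of this):

```python
# Estimate the GPU memory needed just to hold the weights of a ~7B-parameter
# model at a given numeric precision.
def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Memory in GiB for n_params parameters at bytes_per_param each."""
    return n_params * bytes_per_param / 1024**3

n_params = 7e9  # internlm-chat-7b has roughly 7 billion parameters

fp32 = weight_memory_gib(n_params, 4)  # float32: 4 bytes per parameter
fp16 = weight_memory_gib(n_params, 2)  # float16: 2 bytes per parameter

print(f"fp32 weights: ~{fp32:.1f} GiB")  # ~26.1 GiB
print(f"fp16 weights: ~{fp16:.1f} GiB")  # ~13.0 GiB
```

Since ~26 GiB of fp32 weights already exceeds the 23.66 GiB card in the traceback, the OOM is expected; loading in half precision (e.g. passing `torch_dtype=torch.float16` to `from_pretrained`) should roughly halve the weight footprint.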

Can anybody help me?