RuntimeError: "compute_indices_weights_cubic" not implemented for 'Half'

#1
by dogged - opened

transformers 4.37.2
Python 3.8.13
Red Hat 4.8.5-44
RTX 4090
CUDA 12.2

When I try to load the model in float16 with "model = AutoModelForCausalLM.from_pretrained(ckpt_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()", I get the following error:

You are using a model of type internlmxcomposer2 to instantiate a model of type internlm. This is not supported for all configurations of models and can yield errors.
Set max length to 4096
Position interpolate from 24x24 to 35x35
Traceback (most recent call last):
File "", line 1, in
File "/home/users/xxx/anaconda3/envs/langchain/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 561, in from_pretrained
return model_class.from_pretrained(
File "/home/users/xxx/anaconda3/envs/langchain/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3594, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
File "/home/users/xxx/.cache/huggingface/modules/transformers_modules/internlm/internlm-xcomposer2-vl-7b/358caed4fa8e8c8c18b5a6724e986b879a9c9c8e/modeling_internlm_xcomposer2.py", line 67, in init
self.vit = build_vision_tower()
File "/home/users/xxx/.cache/huggingface/modules/transformers_modules/internlm/internlm-xcomposer2-vl-7b/358caed4fa8e8c8c18b5a6724e986b879a9c9c8e/build_mlp.py", line 11, in build_vision_tower
return CLIPVisionTower(vision_tower)
File "/home/users/xxx/.cache/huggingface/modules/transformers_modules/internlm/internlm-xcomposer2-vl-7b/358caed4fa8e8c8c18b5a6724e986b879a9c9c8e/build_mlp.py", line 59, in init
self.resize_pos()
File "/home/users/xxx/.cache/huggingface/modules/transformers_modules/internlm/internlm-xcomposer2-vl-7b/358caed4fa8e8c8c18b5a6724e986b879a9c9c8e/build_mlp.py", line 88, in resize_pos
pos_tokens = torch.nn.functional.interpolate(
File "/home/users/xxx/anaconda3/envs/langchain/lib/python3.8/site-packages/torch/nn/functional.py", line 4028, in interpolate
return torch._C._nn.upsample_bicubic2d(input, output_size, align_corners, scale_factors)
RuntimeError: "compute_indices_weights_cubic" not implemented for 'Half'

What could be the reason? Thanks.
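
The failure can be reproduced with a plain CPU tensor, which points at the cause: from_pretrained builds the vision tower on the CPU before .cuda() runs, and the CPU bicubic kernel has no float16 implementation. A minimal sketch of the usual workaround, upcasting around the interpolate call (the shapes mirror the 24x24 to 35x35 resize in the log above, but are otherwise illustrative):

import torch
import torch.nn.functional as F

# Bicubic interpolation of a float16 tensor fails on the CPU, which is
# where from_pretrained constructs the model before it is moved to CUDA.
pos = torch.randn(1, 1024, 24, 24, dtype=torch.float16)
# F.interpolate(pos, size=(35, 35), mode='bicubic')  # raises the error above

# Workaround: run the resize in float32 and cast back afterwards.
pos_tokens = F.interpolate(
    pos.float(), size=(35, 35), mode='bicubic', align_corners=False
).to(pos.dtype)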

from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt_path = 'internlm/internlm-xcomposer2-vl-7b'
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
# Casting with .half() only after loading means the position-embedding
# resize runs in float32, so the missing float16 kernel is never hit.
model = AutoModelForCausalLM.from_pretrained(ckpt_path, device_map="cuda", trust_remote_code=True).eval().cuda().half()
model.tokenizer = tokenizer

Even in half precision, my GPU memory is still insufficient. Can device_map be set to auto?

Or does it support multiple GPUs? I see a lot of explicit device transfers in the code, and I hit a device-mismatch error when I try device_map='auto':

model = AutoModel.from_pretrained('internlm/internlm-xcomposer2-vl-7b', trust_remote_code=True, device_map='auto').eval()

Traceback (most recent call last):
File "", line 2, in
File "/home/users/xxx/anaconda3/envs/intern_clean/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/users/xxx/.cache/huggingface/modules/transformers_modules/internlm/internlm-xcomposer2-vl-7b/57a2a43dbf0bdbbf1dcb1e275d0bede87466404f/modeling_internlm_xcomposer2.py", line 513, in chat
inputs, im_mask = self.interleav_wrap_chat(tokenizer, query, image, history, meta_instruction)
File "/home/users/xxx/.cache/huggingface/modules/transformers_modules/internlm/internlm-xcomposer2-vl-7b/57a2a43dbf0bdbbf1dcb1e275d0bede87466404f/modeling_internlm_xcomposer2.py", line 208, in interleav_wrap_chat
wrap_embeds = torch.cat(wrap_embeds, dim=1)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:7! (when checking argument for argument tensors in method wrapper_cat)
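
A workaround sketch for the multi-GPU error: pin the whole model to a single device so interleav_wrap_chat never concatenates embeddings that live on different GPUs (device_map={'': 0} is the accelerate convention for "everything on cuda:0"; it only helps if one card has enough memory):

import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    'internlm/internlm-xcomposer2-vl-7b',
    torch_dtype=torch.bfloat16,  # see the bfloat16 note below
    trust_remote_code=True,
    device_map={'': 0},  # place every module on cuda:0
).eval()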

Changing float16 to bfloat16 worked for me (presumably because the CPU bicubic kernel supports bfloat16 but not float16 in this PyTorch version):

model = AutoModelForCausalLM.from_pretrained(model_name_or_path, torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval()
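
For completeness, an end-to-end sketch built on the bfloat16 fix. The chat signature matches the call chain visible in the traceback above (tokenizer, query, image, history); the prompt and image path are placeholders:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt_path = 'internlm/internlm-xcomposer2-vl-7b'
tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
# bfloat16 avoids the missing CPU float16 bicubic kernel during init.
model = AutoModelForCausalLM.from_pretrained(
    ckpt_path, torch_dtype=torch.bfloat16, trust_remote_code=True
).cuda().eval()
model.tokenizer = tokenizer

# '<ImageHere>' marks where the image embedding is interleaved into the prompt.
query = '<ImageHere>Please describe this image.'
response, history = model.chat(tokenizer, query=query, image='./example.jpg', history=[])
print(response)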
