ValueError: You can't train a model that has been loaded in 8-bit precision on a different device than the one you're training on.

#35
by madhurjindal - opened

ValueError Traceback (most recent call last)
Cell In[17], line 25
     22 print_trainable_parameters(model)
     24 # Apply the accelerator. You can comment this out to remove the accelerator.
---> 25 model = accelerator.prepare_model(model)

File /vc_data/shankum/miniconda3/envs/llm2/lib/python3.11/site-packages/accelerate/accelerator.py:1392, in Accelerator.prepare_model(self, model, device_placement, evaluation_mode)
   1389 if torch.device(current_device_index) != self.device:
   1390     # if on the first device (GPU 0) we don't care
   1391     if (self.device.index is not None) or (current_device_index != 0):
-> 1392         raise ValueError(
   1393             "You can't train a model that has been loaded in 8-bit precision on a different device than the one "
   1394             "you're training on. Make sure you loaded the model on the correct device using for example `device_map={'':torch.cuda.current_device() or device_map={'':torch.xpu.current_device()}"
   1395         )
   1397 if "cpu" in model_devices or "disk" in model_devices:
   1398     raise ValueError(
   1399         "You can't train a model that has been loaded in 8-bit precision with CPU or disk offload."
   1400     )

ValueError: You can't train a model that has been loaded in 8-bit precision on a different device than the one you're training on. Make sure you loaded the model on the correct device using for example `device_map={'':torch.cuda.current_device() or device_map={'':torch.xpu.current_device()}
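For reference, what the message is asking for looks roughly like this on a single GPU (a minimal sketch; the checkpoint name and the 8-bit config are illustrative, not from this thread):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Pin the entire 8-bit model to this process's current GPU instead of
# letting device_map="auto" spread it across devices.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # illustrative checkpoint
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map={"": torch.cuda.current_device()},
)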

Are the tokens output by your tokenizer on the same device your model is loaded on?
If your model is on a GPU, make sure you move the token tensors to the GPU as well.
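For example (a minimal sketch, assuming a standard transformers tokenizer/model pair named tokenizer and model):

# Move the tokenized batch onto the model's device before the forward pass.
inputs = tokenizer("Hello world", return_tensors="pt").to(model.device)
outputs = model(**inputs)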

This is an accelerate issue on a multi-GPU setup. I have used the same setup with other SLMs like Zephyr and Llama 2, and they seem to work.

Playing around with the accelerate settings fixed it for me.

@ChristianPalaArtificialy could you share the settings you are using?

@madhurjindal what kind of hardware are you using, and how many GPUs?

@saireddy I am using 8x V100 32GB.

@madhurjindal can you try using 4 of those instead of 8? It's funny, but this fixed it for me.

Same issue.

@saireddy that didn't fix it for me.

Playing around with the accelerate settings fixed it for me.

Could you elaborate more, please? Thanks.

Meta Llama org

Hi everyone!
To fix this issue, you need to force-load the model onto a single GPU and replicate it across all GPUs. To achieve this, please follow the solution proposed here: https://github.com/huggingface/accelerate/issues/1840#issuecomment-1683105994
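For reference, the gist of the linked solution, as I understand it: give each process its own full copy of the model on the GPU it owns, instead of letting device_map shard one copy across GPUs, so accelerator.prepare_model() sees a single-device model per process. A sketch (the checkpoint name and 8-bit config are illustrative):

import torch
from accelerate import Accelerator
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Each process loads the full 8-bit model onto its own GPU, so every
# replica lives on exactly one device and can be trained DDP-style.
device_index = Accelerator().process_index
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # illustrative checkpoint
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map={"": device_index},
)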
