ValueError: weight is on the meta device, we need a `value` to put in on 0.
I get the following error while running the example code given in the model card without changing anything.
```
Traceback (most recent call last):
  File "/data2/dinura/dinura/MultilingualLLM/Llava-Bench/eval-GLM-4-9B.py", line 176, in <module>
    model = Blip2ForConditionalGeneration.from_pretrained("Gregor/mblip-bloomz-7b", device_map="auto")
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fahad/anaconda3/envs/dinllm/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4049, in from_pretrained
    dispatch_model(model, **device_map_kwargs)
  File "/home/fahad/anaconda3/envs/dinllm/lib/python3.11/site-packages/accelerate/big_modeling.py", line 419, in dispatch_model
    attach_align_device_hook_on_blocks(
  File "/home/fahad/anaconda3/envs/dinllm/lib/python3.11/site-packages/accelerate/hooks.py", line 648, in attach_align_device_hook_on_blocks
    attach_align_device_hook_on_blocks(
  File "/home/fahad/anaconda3/envs/dinllm/lib/python3.11/site-packages/accelerate/hooks.py", line 648, in attach_align_device_hook_on_blocks
    attach_align_device_hook_on_blocks(
  File "/home/fahad/anaconda3/envs/dinllm/lib/python3.11/site-packages/accelerate/hooks.py", line 648, in attach_align_device_hook_on_blocks
    attach_align_device_hook_on_blocks(
  File "/home/fahad/anaconda3/envs/dinllm/lib/python3.11/site-packages/accelerate/hooks.py", line 608, in attach_align_device_hook_on_blocks
    add_hook_to_module(module, hook)
  File "/home/fahad/anaconda3/envs/dinllm/lib/python3.11/site-packages/accelerate/hooks.py", line 157, in add_hook_to_module
    module = hook.init_hook(module)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fahad/anaconda3/envs/dinllm/lib/python3.11/site-packages/accelerate/hooks.py", line 275, in init_hook
    set_module_tensor_to_device(module, name, self.execution_device, tied_params_map=self.tied_params_map)
  File "/home/fahad/anaconda3/envs/dinllm/lib/python3.11/site-packages/accelerate/utils/modeling.py", line 354, in set_module_tensor_to_device
    raise ValueError(f"{tensor_name} is on the meta device, we need a `value` to put in on {device}.")
ValueError: weight is on the meta device, we need a `value` to put in on 0.
```
Sorry to hear that. Unfortunately, it works on my system(s). Googling the error turns up similar issues, so maybe those can help. Updating or reinstalling transformers and accelerate is worth a try, and testing an older transformers release would show whether the problem was introduced by a recent update in the library.
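To double-check which versions your environment actually resolves to, a quick sanity check:

```python
import accelerate
import transformers

# Print the installed versions so we can compare environments.
print("transformers:", transformers.__version__)
print("accelerate:", accelerate.__version__)
```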
Could you provide your requirements.txt with the specific versions?
It works with the most recent versions of transformers and accelerate for me. Here is my environment:
```
accelerate==0.34.2
certifi==2024.8.30
charset-normalizer==3.3.2
filelock==3.16.0
fsspec==2024.9.0
huggingface-hub==0.24.6
idna==3.8
Jinja2==3.1.4
MarkupSafe==2.1.5
mpmath==1.3.0
networkx==3.2.1
numpy==2.0.2
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.6.68
nvidia-nvtx-cu12==12.1.105
packaging==24.1
pillow==10.4.0
psutil==6.0.0
PyYAML==6.0.2
regex==2024.7.24
requests==2.32.3
safetensors==0.4.5
sympy==1.13.2
tokenizers==0.19.1
torch==2.4.1
tqdm==4.66.5
transformers==4.44.2
triton==3.0.0
typing_extensions==4.12.2
urllib3==2.2.2
```
Same error with these packages on Python 3.9.19 on an A100 GPU.
So I actually managed to reproduce your problem. The issue seems to be `device_map="auto"`: when I remove it entirely, it works fine. This appears to be a known issue (see e.g. https://github.com/huggingface/transformers/issues/26700). Let me know if that helps.
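For anyone landing here later, a minimal sketch of the workaround: load without `device_map="auto"` and place the model on the GPU manually. The `torch_dtype` choice and the processor line are my own additions (to fit the 7B model on a single GPU), not part of the original model-card example:

```python
import torch
from transformers import Blip2ForConditionalGeneration, Blip2Processor

# Load without device_map="auto": the weights are materialized normally
# instead of going through accelerate's meta-device dispatch hooks.
model = Blip2ForConditionalGeneration.from_pretrained(
    "Gregor/mblip-bloomz-7b",
    torch_dtype=torch.float16,  # assumption: half precision so the model fits on one GPU
)
model.to("cuda")  # move the whole model to a single GPU manually

processor = Blip2Processor.from_pretrained("Gregor/mblip-bloomz-7b")
```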