Help getting the example working
#3 by salamanders
I started with the base example and attempted to substitute in the quantized checkpoint:
import torch
from transformers import Qwen2_5_VLForConditionalGeneration, AutoTokenizer, AutoProcessor, AutoModelForCausalLM, BitsAndBytesConfig
from qwen_vl_utils import process_vision_info
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "unsloth/Qwen2.5-VL-7B-Instruct-unsloth-bnb-4bit",  # quantized; base was "Qwen/Qwen2.5-VL-7B-Instruct"
    attn_implementation="eager",
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")
But this errors with:
(venv) /srv/projects/quenvl$ python3 quentest.py
`low_cpu_mem_usage` was None, now default to True since model is quantized.
Traceback (most recent call last):
File "/srv/projects/quenvl/quentest.py", line 5, in <module>
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/srv/projects/quenvl/venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 271, in _wrapper
(snip)
File "/srv/projects/quenvl/venv/lib/python3.12/site-packages/bitsandbytes/utils.py", line 196, in unpack_tensor_to_dict
json_bytes = bytes(tensor_data.cpu().numpy())
^^^^^^^^^^^^^^^^^
NotImplementedError: Cannot copy out of meta tensor; no data!
Any suggestions?
+1
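The "Cannot copy out of meta tensor; no data!" error generally means the weights were never materialized onto a real device: with a pre-quantized checkpoint, low_cpu_mem_usage defaults to True (the log says as much), so the model is first built on the meta device, and without a device_map the quantization state can stay there. Before downgrading anything, it may be worth loading with an explicit device map. A minimal sketch, assuming a single CUDA GPU; device_map="auto" and the bfloat16 dtype are my guesses, not something the model card mandates:

import torch
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor

# device_map="auto" places the weights on an actual device instead of
# leaving them on meta, which is what the bitsandbytes unpack step trips
# over; torch_dtype sets the compute dtype for the non-4-bit modules.
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "unsloth/Qwen2.5-VL-7B-Instruct-unsloth-bnb-4bit",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    attn_implementation="eager",
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-7B-Instruct")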
Just a heads-up: I had issues with transformers v4.50. Downgrading to 4.49 fixed it for me.
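If the regression really is version-specific, a guard at the top of the script makes the failure mode obvious instead of cryptic. A small sketch; the 4.49/4.50 boundary comes from the reply above and I haven't confirmed it myself:

import transformers

# transformers 4.50 reportedly breaks loading of pre-quantized
# bitsandbytes checkpoints (see the traceback above); 4.49.x is
# reported to work. Downgrade with: pip install "transformers==4.49.0"
if transformers.__version__.startswith("4.50"):
    raise RuntimeError(
        f"transformers {transformers.__version__} is reported to break "
        "pre-quantized bnb loading here; pin 4.49.x instead."
    )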