"Failed to load the weights ... AssertionError: Model architecture Qwen2ForCausalLM not supported yet."

by dakerholdings - opened 1 day ago

I tried to load this model on my M1 MacBook Air and got the error "Failed to load the weights" ... "AssertionError: Model architecture Qwen2ForCausalLM not supported yet." Indeed, hqq/engine/hf.py does not list that architecture among those it supports.
...
compute_dtype = torch.bfloat16
...
Failed to load the weights
Traceback (most recent call last):
  File "/Users/ds/run_smashed_vibethinker.py", line 21, in <module>
    model = HQQModelForCausalLM.from_quantized("PrunaAI/WeiboAI-VibeThinker-3B-HQQ-4bit-smashed", compute_dtype=compute_dtype)
  File "/Users/ds/.pyenv/versions/3.10.4/lib/python3.10/site-packages/hqq/engine/base.py", line 85, in from_quantized
    cls._check_arch_support(arch_key)
  File "/Users/ds/.pyenv/versions/3.10.4/lib/python3.10/site-packages/hqq/engine/base.py", line 38, in _check_arch_support
    assert arch in cls._HQQ_REGISTRY, (
AssertionError: Model architecture Qwen2ForCausalLM not supported yet.
...
[ ... and then, FWIW, goes on to list two other errors occurring during exception handling ... ]
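
For anyone wanting to check this before downloading the weights, here is a minimal sketch of the kind of membership test the traceback points at (`assert arch in cls._HQQ_REGISTRY`). It only uses the standard library. The `SUPPORTED_ARCHS` set, the helper names, and the sample config string are all hypothetical stand-ins, not the real contents of hqq's registry or of this repo's config.json; the actual supported list depends on the installed hqq version.

```python
import json

# Hypothetical stand-in for the registry referenced in the traceback
# (cls._HQQ_REGISTRY). The real contents depend on the installed hqq version.
SUPPORTED_ARCHS = {"LlamaForCausalLM", "MistralForCausalLM"}


def read_architectures(config_text: str) -> list[str]:
    """Return the `architectures` list from a Hugging Face config.json string."""
    return json.loads(config_text).get("architectures", [])


def unsupported_archs(config_text: str, supported: set[str]) -> list[str]:
    """List architectures named in the config that are not in `supported`."""
    return [a for a in read_architectures(config_text) if a not in supported]


# Illustrative config content (assumed, not copied from the model repo).
config_text = '{"architectures": ["Qwen2ForCausalLM"], "model_type": "qwen2"}'
missing = unsupported_archs(config_text, SUPPORTED_ARCHS)
if missing:
    print(f"Not in registry: {missing}")
```

Running the same check against the model's real config.json and the architecture list in the installed hqq would show whether the failure is a version mismatch or genuinely missing support.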
