No longer works in Ooba

#15
by vdruts - opened

Likely converted with an old GPTQ version; no longer compatible.

storage = cls(wrap_storage=untyped_storage)
Traceback (most recent call last):
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\server.py", line 347, in
shared.model, shared.tokenizer = load_model(shared.model_name)
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\modules\models.py", line 103, in load_model
model = load_quantized(model_name)
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\modules\GPTQ_loader.py", line 136, in load_quantized
model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize, kernel_switch_threshold=threshold)
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\modules\GPTQ_loader.py", line 61, in _load_quant
model.load_state_dict(safe_load(checkpoint), strict=False)
File "C:\Users\xxxx\Deep\text-diffusion-webui\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
size mismatch for model.layers.0.self_attn.k_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.self_attn.k_proj.scales: copying a param with shape torch.Size([52, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.self_attn.o_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.self_attn.o_proj.scales: copying a param with shape torch.Size([52, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.self_attn.q_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.self_attn.q_proj.scales: copying a param with shape torch.Size([52, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.self_attn.v_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.self_attn.v_proj.scales: copying a param with shape torch.Size([52, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.mlp.down_proj.qzeros: copying a param with shape torch.Size([140, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.mlp.down_proj.scales: copying a param with shape torch.Size([140, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.mlp.gate_proj.qzeros: copying a param with shape torch.Size([52, 2240]) from checkpoint, the shape in current model is torch.Size([1, 2240]).
size mismatch for model.layers.0.mlp.gate_proj.scales: copying a param with shape torch.Size([52, 17920]) from checkpoint, the shape in current model is torch.Size([1, 17920]).
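For what it's worth, the shapes in that error encode a group-size disagreement: the checkpoint's scales/qzeros have in_features / 128 rows (6656 / 128 = 52 for the attention projections, 17920 / 128 = 140 for down_proj), i.e. a 128g quantization, while the model the loader just built has a single row, which is what GPTQ produces for groupsize -1. A minimal sketch for checking this directly against the checkpoint (the path and layer name are illustrative, taken from the log above):

```python
# Minimal sketch: infer the group size a GPTQ checkpoint was quantized
# with by inspecting its tensor shapes. Path/layer name are illustrative.
from safetensors import safe_open

CKPT = "models/elinas_alpaca-30b-lora-int4/alpaca-30b-4bit-128g.safetensors"

with safe_open(CKPT, framework="pt", device="cpu") as f:
    qweight = f.get_tensor("model.layers.0.self_attn.k_proj.qweight")
    scales = f.get_tensor("model.layers.0.self_attn.k_proj.scales")

in_features = qweight.shape[0] * 8  # 4-bit: eight weights packed per int32
groups = scales.shape[0]            # one row of scales per quantization group
print("group size:", in_features // groups)  # expect 128 for a 128g file
```

If that prints 128 while the loader builds [1, ...] shapes, the webui constructed the model with groupsize -1, so the thing to check is whether --groupsize 128 is actually reaching the loader.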

It works fine on the stable version; you're just not using the correct GPTQ version, as stated in his wiki. See https://huggingface.co/elinas/alpaca-30b-lora-int4#important---update-2023-04-05

He made some recent changes which required the GPTQ version to be updated; otherwise it didn't work with any other models. All other models are working with the latest GPTQ.

I'm not seeing that; point me to the commit. The only breaking change I'm seeing is a transformers update which requires updating the tokenizer. See: https://github.com/oobabooga/text-generation-webui/issues/931#issuecomment-1501259027
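If it is the tokenizer change from that issue, one way to migrate is to load and re-save the tokenizer with a current transformers release so the files get regenerated in the new format. A minimal sketch, assuming the model folder still holds the old tokenizer files (the path is illustrative):

```python
# Minimal sketch: regenerate the tokenizer files with an up-to-date
# transformers release. The model path is illustrative.
from transformers import LlamaTokenizer

path = "models/elinas_alpaca-30b-lora-int4"
tok = LlamaTokenizer.from_pretrained(path)
tok.save_pretrained(path)  # writes updated tokenizer files in place
```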

You're right. Actually, I was on the correct branch of GPTQ and haven't touched it.

Confirmed. I'm on Ooba's fork of GPTQ. I even pulled it fresh.

Loading elinas_alpaca-30b-lora-int4...
Found the following quantized model: models\elinas_alpaca-30b-lora-int4\alpaca-30b-4bit-128g.safetensors
Loading model ...
C:\Users\xxxx\Deep\text-diffusion-webui\installer_files\env\lib\site-packages\safetensors\torch.py:99: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
with safe_open(filename, framework="pt", device=device) as f:
C:\Users\xxxx\Deep\text-diffusion-webui\installer_files\env\lib\site-packages\torch\_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
C:\Users\xxxx\Deep\text-diffusion-webui\installer_files\env\lib\site-packages\torch\storage.py:899: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = cls(wrap_storage=untyped_storage)

Traceback (most recent call last):
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\server.py", line 347, in
shared.model, shared.tokenizer = load_model(shared.model_name)
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\modules\models.py", line 103, in load_model
model = load_quantized(model_name)
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\modules\GPTQ_loader.py", line 136, in load_quantized
model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize, kernel_switch_threshold=threshold)
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\modules\GPTQ_loader.py", line 61, in _load_quant
model.load_state_dict(safe_load(checkpoint), strict=False)
File "C:\Users\xxxx\Deep\text-diffusion-webui\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
size mismatch for model.layers.0.self_attn.k_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.self_attn.k_proj.scales: copying a param with shape torch.Size([52, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.self_attn.o_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.self_attn.o_proj.scales: copying a param with shape torch.Size([52, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.self_attn.q_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).

For the record, I just downloaded and ran your Vicuna model with the same settings and it runs fine, but this one does not. Thanks for your help!

Just tried loading this model and it worked, even though it threw the storage warnings > https://huggingface.co/MetaIX/Alpaca-30B-Int4-128G-Safetensors/discussions/new

elinas changed discussion status to closed

I realize you closed this, but this error is still present. I loaded another 30B model (which worked).

Vicuna is quantized on the same commit. I can't think of anything else, unless you're using the old .pt file.
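One quick way to rule that out is to list every checkpoint candidate in the model folder; an old .pt sitting next to the .safetensors can end up being the file the loader actually picks. A minimal sketch (the path is illustrative):

```python
# Minimal sketch: list the quantized checkpoint candidates in the model
# folder to see which file the loader could be picking up.
from pathlib import Path

model_dir = Path("models/elinas_alpaca-30b-lora-int4")  # illustrative path
for f in sorted(model_dir.iterdir()):
    if f.suffix in {".pt", ".safetensors"}:
        print(f.name, f"{f.stat().st_size / 1e9:.1f} GB")
```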

elinas changed discussion status to open

No, I was using the safetensors file.

"Alpaca 30B 4-bit working with GPTQ versions used in Oobabooga's Text Generation Webui and KoboldAI." Was it this one?
This one is not Vicuna, but yeah, my Vicuna works. No idea what's going on, lol.

These are the last releases of this model. If you cannot get them to work, then wait until I release a better model.

elinas changed discussion status to closed

Thanks for your effort :)
