No longer works in Ooba

#15
by vdruts - opened

Likely converted with an old GPTQ version; no longer compatible.

storage = cls(wrap_storage=untyped_storage)
Traceback (most recent call last):
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\server.py", line 347, in
shared.model, shared.tokenizer = load_model(shared.model_name)
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\modules\models.py", line 103, in load_model
model = load_quantized(model_name)
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\modules\GPTQ_loader.py", line 136, in load_quantized
model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize, kernel_switch_threshold=threshold)
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\modules\GPTQ_loader.py", line 61, in _load_quant
model.load_state_dict(safe_load(checkpoint), strict=False)
File "C:\Users\xxxx\Deep\text-diffusion-webui\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
size mismatch for model.layers.0.self_attn.k_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.self_attn.k_proj.scales: copying a param with shape torch.Size([52, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.self_attn.o_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.self_attn.o_proj.scales: copying a param with shape torch.Size([52, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.self_attn.q_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.self_attn.q_proj.scales: copying a param with shape torch.Size([52, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.self_attn.v_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.self_attn.v_proj.scales: copying a param with shape torch.Size([52, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.mlp.down_proj.qzeros: copying a param with shape torch.Size([140, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.mlp.down_proj.scales: copying a param with shape torch.Size([140, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.mlp.gate_proj.qzeros: copying a param with shape torch.Size([52, 2240]) from checkpoint, the shape in current model is torch.Size([1, 2240]).
size mismatch for model.layers.0.mlp.gate_proj.scales: copying a param with shape torch.Size([52, 17920]) from checkpoint, the shape in current model is torch.Size([1, 17920]).
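For what it's worth, the shapes in that error encode a group-size disagreement: the checkpoint's scales/qzeros have in_features / 128 rows (6656 / 128 = 52 for the attention projections, 17920 / 128 = 140 for down_proj), i.e. a 128g quantization, while the model the loader just built has a single row, which is what GPTQ produces for groupsize -1. A minimal sketch for checking this directly against the checkpoint (the path and layer name are illustrative, taken from the log above):

```python
# Minimal sketch: infer the group size a GPTQ checkpoint was quantized
# with by inspecting its tensor shapes. Path/layer name are illustrative.
from safetensors import safe_open

CKPT = "models/elinas_alpaca-30b-lora-int4/alpaca-30b-4bit-128g.safetensors"

with safe_open(CKPT, framework="pt", device="cpu") as f:
    qweight = f.get_tensor("model.layers.0.self_attn.k_proj.qweight")
    scales = f.get_tensor("model.layers.0.self_attn.k_proj.scales")

in_features = qweight.shape[0] * 8  # 4-bit: eight weights packed per int32
groups = scales.shape[0]            # one row of scales per quantization group
print("group size:", in_features // groups)  # expect 128 for a 128g file
```

If that prints 128 while the loader builds [1, ...] shapes, the webui constructed the model with groupsize -1, so the thing to check is whether --groupsize 128 is actually reaching the loader.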

It works fine on the stable version; you're just not using the correct GPTQ version, as stated in his wiki. See https://huggingface.co/elinas/alpaca-30b-lora-int4#important---update-2023-04-05

He made some recent changes which required the GPTQ version to be updated; otherwise it didn't work with any other models. All other models are working with the latest GPTQ.

I'm not seeing that; point me to the commit. The only breaking change I'm seeing is a transformers update which requires updating the tokenizer. See: https://github.com/oobabooga/text-generation-webui/issues/931#issuecomment-1501259027
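If it is the tokenizer change from that issue, one way to migrate is to load and re-save the tokenizer with a current transformers release so the files get regenerated in the new format. A minimal sketch, assuming the model folder still holds the old tokenizer files (the path is illustrative):

```python
# Minimal sketch: regenerate the tokenizer files with an up-to-date
# transformers release. The model path is illustrative.
from transformers import LlamaTokenizer

path = "models/elinas_alpaca-30b-lora-int4"
tok = LlamaTokenizer.from_pretrained(path)
tok.save_pretrained(path)  # writes updated tokenizer files in place
```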

You're right. Actually, I was on the correct branch of GPTQ and haven't touched it.

Confirmed. I'm on Ooba's fork of GPTQ. I even pulled it fresh.

Loading elinas_alpaca-30b-lora-int4...
Found the following quantized model: models\elinas_alpaca-30b-lora-int4\alpaca-30b-4bit-128g.safetensors
Loading model ...
C:\Users\xxxx\Deep\text-diffusion-webui\installer_files\env\lib\site-packages\safetensors\torch.py:99: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
with safe_open(filename, framework="pt", device=device) as f:
C:\Users\xxxx\Deep\text-diffusion-webui\installer_files\env\lib\site-packages\torch\_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
return self.fget.__get__(instance, owner)()
C:\Users\xxxx\Deep\text-diffusion-webui\installer_files\env\lib\site-packages\torch\storage.py:899: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
storage = cls(wrap_storage=untyped_storage)

Traceback (most recent call last):
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\server.py", line 347, in
shared.model, shared.tokenizer = load_model(shared.model_name)
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\modules\models.py", line 103, in load_model
model = load_quantized(model_name)
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\modules\GPTQ_loader.py", line 136, in load_quantized
model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize, kernel_switch_threshold=threshold)
File "C:\Users\xxxx\Deep\text-diffusion-webui\text-generation-webui\modules\GPTQ_loader.py", line 61, in _load_quant
model.load_state_dict(safe_load(checkpoint), strict=False)
File "C:\Users\xxxx\Deep\text-diffusion-webui\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
size mismatch for model.layers.0.self_attn.k_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.self_attn.k_proj.scales: copying a param with shape torch.Size([52, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.self_attn.o_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).
size mismatch for model.layers.0.self_attn.o_proj.scales: copying a param with shape torch.Size([52, 6656]) from checkpoint, the shape in current model is torch.Size([1, 6656]).
size mismatch for model.layers.0.self_attn.q_proj.qzeros: copying a param with shape torch.Size([52, 832]) from checkpoint, the shape in current model is torch.Size([1, 832]).

For the record, I just downloaded and ran your Vicuna model with the same settings and it runs fine, but this one does not. Thanks for your help!

Just tried loading this model and it worked, even though it threw the storage warnings > https://huggingface.co/MetaIX/Alpaca-30B-Int4-128G-Safetensors/discussions/new

elinas changed discussion status to closed

I realize you closed this, but this error is still present. I loaded another 30B model (which worked).

Vicuna is quantized on the same commit. I can't think of anything else, unless you're using the old .pt file.
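One quick way to rule that out is to list every checkpoint candidate in the model folder; an old .pt sitting next to the .safetensors can end up being the file the loader actually picks. A minimal sketch (the path is illustrative):

```python
# Minimal sketch: list the quantized checkpoint candidates in the model
# folder to see which file the loader could be picking up.
from pathlib import Path

model_dir = Path("models/elinas_alpaca-30b-lora-int4")  # illustrative path
for f in sorted(model_dir.iterdir()):
    if f.suffix in {".pt", ".safetensors"}:
        print(f.name, f"{f.stat().st_size / 1e9:.1f} GB")
```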

elinas changed discussion status to open

No, I was using the safetensors file.

"Alpaca 30B 4-bit working with GPTQ versions used in Oobabooga's Text Generation Webui and KoboldAI." Was it this one?
This one is not Vicuna, but yeah, my Vicuna works. No idea what's going on, lol.

These are the last releases of this model. If you cannot get them to work, then wait until I release a better model.

elinas changed discussion status to closed

Thanks for your effort :)
