RuntimeError

#14
by xzqxnet - opened

I ran into this problem when running the command:
CUDA_VISIBLE_DEVICES=0,1 python llama_inference.py ${MODEL_DIR} --wbits 4 --groupsize 128 --load ${MODEL_DIR}/Guanaco-65B-GPTQ.safetensors --text "this is llama" --device=0

Traceback (most recent call last):
File "llama_inference.py", line 110, in <module>
model = load_quant(args.model, args.load, args.wbits, args.groupsize, fused_mlp=args.fused_mlp)
File "llama_inference.py", line 56, in load_quant
model.load_state_dict(safe_load(checkpoint), strict=False)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1497, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
size mismatch for model.layers.0.self_attn.k_proj.qzeros: copying a param with shape torch.Size([1, 1024]) from checkpoint, the shape in current model is torch.Size([64, 1024]).
size mismatch for model.layers.0.self_attn.k_proj.scales: copying a param with shape torch.Size([1, 8192]) from checkpoint, the shape in current model is torch.Size([64, 8192]).
size mismatch for model.layers.0.self_attn.o_proj.qzeros: copying a param with shape torch.Size([1, 1024]) from checkpoint, the shape in current model is torch.Size([64, 1024]).
size mismatch for model.layers.0.self_attn.o_proj.scales: copying a param with shape torch.Size([1, 8192]) from checkpoint, the shape in current model is torch.Size([64, 8192]).
size mismatch for model.layers.0.self_attn.q_proj.qzeros: copying a param with shape torch.Size([1, 1024]) from checkpoint, the shape in current model is torch.Size([64, 1024]).
size mismatch for model.layers.0.self_attn.q_proj.scales: copying a param with shape torch.Size([1, 8192]) from checkpoint, the shape in current model is torch.Size([64, 8192]).
size mismatch for model.layers.0.self_attn.v_proj.qzeros: copying a param with shape torch.Size([1, 1024]) from checkpoint, the shape in current model is torch.Size([64, 1024]).
size mismatch for model.layers.0.self_attn.v_proj.scales: copying a param with shape torch.Size([1, 8192]) from checkpoint, the shape in current model is torch.Size([64, 8192]).
size mismatch for model.layers.0.mlp.down_proj.qzeros: copying a param with shape torch.Size([1, 1024]) from checkpoint, the shape in current model is torch.Size([172, 1024]).
size mismatch for model.layers.0.mlp.down_proj.scales: copying a param with shape torch.Size([1, 8192]) from checkpoint, the shape in current model is torch.Size([172, 8192]).
size mismatch for model.layers.0.mlp.gate_proj.qzeros: copying a param with shape torch.Size([1, 2752]) from checkpoint, the shape in current model is torch.Size([64, 2752]).
size mismatch for model.layers.0.mlp.gate_proj.scales: copying a param with shape torch.Size([1, 22016]) from checkpoint, the shape in current model is torch.Size([64, 22016]).
size mismatch for model.layers.0.mlp.up_proj.qzeros: copying a param with shape torch.Size([1, 2752]) from checkpoint, the shape in current model is torch.Size([64, 2752]).
size mismatch for model.layers.0.mlp.up_proj.scales: copying a param with shape torch.Size([1, 22016]) from checkpoint, the shape in current model is torch.Size([64, 22016]).
size mismatch for model.layers.1.self_attn.k_proj.qzeros: copying a param with shape torch.Size([1, 1024]) from checkpoint, the shape in current model is torch.Size([64, 1024]).
size mismatch for model.layers.1.self_attn.k_proj.scales: copying a param with shape torch.Size([1, 8192]) from checkpoint, the shape in current model is torch.Size([64, 8192]).
size mismatch for model.layers.1.self_attn.o_proj.qzeros: copying a param with shape torch.Size([1, 1024]) from checkpoint, the shape in current model is torch.Size([64, 1024]).
size mismatch for model.layers.1.self_attn.o_proj.scales: copying a param with shape torch.Size([1, 8192]) from checkpoint, the shape in current model is torch.Size([64, 8192]).
size mismatch for model.layers.1.self_attn.q_proj.qzeros: copying a param with shape torch.Size([1, 1024]) from checkpoint, the shape in current model is torch.Size([64, 1024]).
size mismatch for model.layers.1.self_attn.q_proj.scales: copying a param with shape torch.Size([1, 8192]) from checkpoint, the shape in current model is torch.Size([64, 8192]).
size mismatch for model.layers.1.self_attn.v_proj.qzeros: copying a param with shape torch.Size([1, 1024]) from checkpoint, the shape in current model is torch.Size([64, 1024]).
size mismatch for model.layers.1.self_attn.v_proj.scales: copying a param with shape torch.Size([1, 8192]) from checkpoint, the shape in current model is torch.Size([64, 8192]).
size mismatch for model.layers.1.mlp.down_proj.qzeros: copying a param with shape torch.Size

The checkpoint was quantized with groupsize -1 (a single set of scales/zeros per tensor), not 128. That is why every qzeros/scales tensor in the checkpoint has a first dimension of 1 while the model you built with --groupsize 128 expects 64 (or 172) groups. Load it with --groupsize -1 instead of --groupsize 128.
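You can check this from the shapes in the traceback. The first dimension of qzeros/scales is the number of quantization groups, ceil(in_features / groupsize), with groupsize -1 meaning one group spanning the whole input dimension. A minimal sketch (num_groups is a hypothetical helper for illustration, not part of the repo):

```python
import math

def num_groups(in_features: int, groupsize: int) -> int:
    """Number of quantization groups for a linear layer."""
    if groupsize == -1:          # one group covering all input features
        return 1
    return math.ceil(in_features / groupsize)

# Shapes the --groupsize 128 model expects for a 65B LLaMA:
print(num_groups(8192, 128))     # attention projections (in_features=8192) -> 64
print(num_groups(22016, 128))    # mlp.down_proj (in_features=22016) -> 172

# Shapes actually stored in this checkpoint:
print(num_groups(8192, -1))      # -> 1, matching torch.Size([1, ...])
```

The 64 and 172 match the "shape in current model" values in the error, and the 1 matches the checkpoint, confirming the checkpoint was made with groupsize -1.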