TypeError: forward() takes 2 positional arguments but 3 were given

#5 by prabhatk579

Hi, when I try to load the model with a 4-bit quantization configuration, I get the following error:

Traceback (most recent call last):
  File ".../model_load_4k.py", line 38, in <module>
    response, _ = model.chat(
...
  File ".../.venv/lib/python3.9/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
TypeError: forward() takes 2 positional arguments but 3 were given
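
For context on the message itself: Python counts self among forward()'s positional arguments, so "takes 2 positional arguments but 3 were given" means a submodule whose forward() accepts a single input is being called with two. A toy module of my own (not from the model code) reproduces the same message:

import torch
import torch.nn as nn

class OneArg(nn.Module):
    def forward(self, x):  # self + x = the 2 accepted positional arguments
        return x

m = OneArg()
# Calling with two inputs raises:
# TypeError: forward() takes 2 positional arguments but 3 were given
m(torch.zeros(1), torch.zeros(1))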

Here is the code for loading the model:

import torch
from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization with nested (double) quantization; compute in bfloat16
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

# init model and tokenizer
model = AutoModel.from_pretrained(
    "internlm/internlm-xcomposer2-4khd-7b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    # low_cpu_mem_usage=True,
    # load_in_4bit=True,
    cache_dir="./model",
    quantization_config=nf4_config,
).eval()
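
For what it's worth, here is a quick sanity check I run after loading (this uses the public bitsandbytes API, nothing specific to this model) to confirm that the linear layers were actually replaced by 4-bit modules:

import bitsandbytes as bnb

# count how many layers were swapped to bitsandbytes 4-bit linears
n_4bit = sum(isinstance(mod, bnb.nn.Linear4bit) for mod in model.modules())
print(f"{n_4bit} Linear4bit modules in the model")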

tokenizer = AutoTokenizer.from_pretrained(
    "internlm/internlm-xcomposer2-4khd-7b",
    trust_remote_code=True,
    cache_dir="./model",
    # load_in_4bit is a model kwarg, not a tokenizer kwarg, so it is dropped here
)

###############
# First Round
###############

query = "<ImageHere>Illustrate the fine details present in the image"
image = "examples/example4.jpeg"
with torch.cuda.amp.autocast():
    response, _ = model.chat(
        tokenizer,
        query=query,
        image=image,
        hd_num=55,
        history=[],
        do_sample=False,
        num_beams=3,
    )
print(response)

How can I load the model in 4-bit, given that I have limited resources?
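
For completeness, this is the simpler 4-bit path from the Transformers quantization docs that I would otherwise fall back to (a sketch I have not verified against this model):

from transformers import AutoModel

# simplest documented 4-bit load: default bitsandbytes settings plus
# device_map="auto" so accelerate places the quantized weights (untested here)
model = AutoModel.from_pretrained(
    "internlm/internlm-xcomposer2-4khd-7b",
    trust_remote_code=True,
    load_in_4bit=True,
    device_map="auto",
).eval()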
