error model.generate()

#13
by NickyNicky - opened

error images: [two screenshots of the generate() traceback; the RuntimeError is quoted below]

code:

import os
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "google/gemma-7b-it"

# Load the model in 4-bit NF4 with bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, token=os.environ["HF_TOKEN"])
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map={"": 0},
    token=os.environ["HF_TOKEN"],
)

%%time
# Build the prompt with Gemma's chat template
chat = [
    {"role": "user", "content": "Write a hello world program"},
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

inputs = tokenizer.encode(prompt, add_special_tokens=True, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=250)


# Decode and print the output
text = tokenizer.batch_decode(outputs)[0]
print(text)

Facing the same issue here.

Google org

I just tested locally and it works for me. Would you mind sharing the hardware you are running on? cc @ArthurZ @ybelkada in case they have any ideas.

I'm running it on a Colab T4 GPU. Somehow gemma-2b-it runs fine, but 7b-it throws the above error.

Same issue here with gemma-7b-it:

RuntimeError: shape '[1, 9, 3072]' is invalid for input of size 36864

And somehow, the model runs fine in Kaggle. I can use gemma-7b-it in Kaggle, but it throws the size error in Colab. On the flip side, gemma-2b-it runs fine in Colab (but I don't know how to control the number of output tokens generated; the response is cut off in the middle. For example, for the question "Who are you?", the response I received was "I am a large language model, trained by Google. I am a").
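(For the truncation question: a minimal sketch, assuming the cut-off comes from the generation length limit; max_new_tokens is the standard transformers generate() argument, and the snippet reuses the model/tokenizer/inputs from the code above.)

# Without an explicit length setting, the default generation config caps
# output at max_length=20 tokens, which cuts answers off mid-sentence.
# Raising max_new_tokens allows a longer completion.
outputs = model.generate(
    input_ids=inputs.to(model.device),
    max_new_tokens=512,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])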

Curiously, the gemma-2b-it model works correctly, but the 7b-it and 7b base models do not.

Google Colab T4, V100, and A100 GPUs: none of them work.

Same here -- gemma-7b and gemma-7b-it both fail on Colab with the same error.


Google org

Thanks all for reporting! I managed to reproduce with torch 2.1.0, but the error doesn't appear with torch 2.2.0.

Could you share your torch version, upgrade it to 2.2.0 if that's not already the case, and let us know if it helps?
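For anyone checking: a quick way to print the installed torch version in a Colab cell (plain PyTorch API, nothing model-specific):

import torch

# The thread suggests Colab was shipping a torch older than 2.1.1 at the time,
# which is the configuration that triggers the shape error below.
print(torch.__version__)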

Google Colab

version error: [screenshot of the preinstalled torch version in Colab]

Google org
edited Feb 21

Hey all! The source of the issue is a difference in the attention implementation. Any torch version before 2.1.1 will fall back to the eager implementation, as sdpa isn't supported by torch in those versions. We will fix the models to work with these versions in transformers ASAP and release a patch; in the meantime, we recommend using a torch version that satisfies torch>=2.1.1 in order to leverage the sdpa attention implementation, which works correctly.

Here is the necessary line to install the relevant PyTorch version in Colab:

pip install "torch>=2.1.1" -U

Please restart your runtime afterwards so it picks up the updated PyTorch version!
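As a side note, a minimal sketch of pinning the attention implementation explicitly at load time; attn_implementation is the standard transformers from_pretrained keyword (available in recent transformers releases), and requesting "sdpa" fails at load time if the installed torch can't provide it:

from transformers import AutoModelForCausalLM

# Ask for torch's scaled_dot_product_attention kernel explicitly instead of
# letting transformers pick; an unsupported torch setup then surfaces as a
# clear error at load time rather than as a shape error during generate().
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b-it",
    attn_implementation="sdpa",
)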

Works with 2.2.0.


# with this line, torch is automatically updated to 2.2.0+cu121 as a dependency

!pip install torchaudio==2.2.0
https://huggingface.co/google/gemma-7b/discussions/17
NickyNicky changed discussion status to closed

Hey all! There's a PR to fix the "eager" attention in Transformers: https://github.com/huggingface/transformers/pull/29187. Once this is merged, we'll do a patch release and bump the latest PyPI version of Transformers to include this fix.

cc @ArthurZ

Google org

Patch release is done! Thanks all for the prompt report, and sorry for not catching this earlier. You can pick up the fix with: pip install -U transformers
