Bug with number generation?

#30
by myownskyW7 - opened

Hi, thanks for your amazing work, but I found some strange output from the 7B-it model.
Whenever it generates a number, such as a year, it emits `<pad>` tokens right after it. My transformers version is 4.38.1, and the official poem and hello-world examples work fine.


Google org

Could you share an exact snippet?

Here is my code.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

new_path = 'google/gemma-7b-it'
model = AutoModelForCausalLM.from_pretrained(new_path, device_map='cuda', torch_dtype=torch.float16, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(new_path, trust_remote_code=True)
```

For the first case:

```python
input_text = "Introducing Einstein"
input_ids = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(**input_ids, max_length=300)
print(tokenizer.decode(outputs[0]))
```

For the second case:

```python
chat = [
    { "role": "user", "content": "Introducing history of USA" },
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# The Gemma chat template already inserts <bos>, so don't add special tokens again.
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=150, num_beams=3)
print(tokenizer.decode(outputs[0]))
```
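
As a side note, a quick way to confirm the pads are literal `<pad>` tokens rather than a decoding artifact is to look at the raw ids (this uses only the standard tokenizers API, nothing Gemma-specific):

```python
# Map the generated ids back to token strings; runs of <pad> are easy to spot.
print(tokenizer.convert_ids_to_tokens(outputs[0]))
print(tokenizer.pad_token, tokenizer.pad_token_id)
```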

Same issue here! I was using the 2b-it model yesterday because 7b wasn't compatible with my CUDA 11.7 driver version, and it was working fine. Now that they pushed a patch and 7b-it works with my CUDA version, I get a bunch of pads, similar to the poster. I'm asking the model to return a numbered list; a minimal repro is sketched below.

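Something along these lines triggers it (the prompt is illustrative, not the exact one from the screenshot; `model` and `tokenizer` are loaded as in the snippet above):

```python
# Ask for a numbered list, which is where the <pad> runs show up.
chat = [{"role": "user", "content": "List the 5 largest countries by area as a numbered list."}]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# The template already inserts <bos>, so skip the extra special tokens.
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
outputs = model.generate(input_ids=inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0]))
```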

Google org

Hi there, Surya from the Gemma team -- sorry for the delay. I saw this issue elsewhere as well. Are you using the right formatter? What are your sampling settings?
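
For reference, the -it checkpoints expect the `<start_of_turn>`/`<end_of_turn>` turn format, which `apply_chat_template` produces automatically. A minimal sketch with explicit sampling settings (the values are illustrative, not a tuned recommendation from the Gemma team):

```python
# Build the prompt through the chat template so the turn markers and <bos>
# are inserted correctly, then generate with explicit sampling settings.
chat = [{"role": "user", "content": "Introducing Einstein"}]
inputs = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=150,
    do_sample=True,    # greedy and beam search behave differently; worth comparing
    temperature=0.7,   # illustrative values, not a recommendation
    top_k=50,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```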
