Bug with number generation?

#30
by myownskyW7 - opened

Hi, thanks for your amazing work, but I found some strange output from the 7B-it model.
Whenever it generates a number, such as a year, it emits `<pad>` tokens right after it. My transformers version is 4.38.1, and the official poem and hello-world examples work fine.


Google org

Could you share an exact snippet?

Here is my code.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

new_path = 'google/gemma-7b-it'
model = AutoModelForCausalLM.from_pretrained(new_path, device_map='cuda', torch_dtype=torch.float16, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(new_path, trust_remote_code=True)
```

For the first case:

```python
input_text = "Introducing Einstein"
input_ids = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(**input_ids, max_length=300)
print(tokenizer.decode(outputs[0]))
```

For the second case:

```python
chat = [
    { "role": "user", "content": "Introducing history of USA" },
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# The Gemma chat template already inserts <bos>, so don't add special tokens again.
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=150, num_beams=3)
print(tokenizer.decode(outputs[0]))
```
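
As a side note, a quick way to confirm the pads are literal `<pad>` tokens rather than a decoding artifact is to look at the raw ids (this uses only the standard tokenizers API, nothing Gemma-specific):

```python
# Map the generated ids back to token strings; runs of <pad> are easy to spot.
print(tokenizer.convert_ids_to_tokens(outputs[0]))
print(tokenizer.pad_token, tokenizer.pad_token_id)
```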

Same issue here! I was using the 2b-it model yesterday because 7b wasn't compatible with my CUDA 11.7 driver version, and it was working fine. Now that they pushed a patch and 7b-it works with my CUDA version, I get a bunch of pads, similar to the poster. I'm asking the model to return a numbered list; a minimal repro is sketched below.

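Something along these lines triggers it (the prompt is illustrative, not the exact one from the screenshot; `model` and `tokenizer` are loaded as in the snippet above):

```python
# Ask for a numbered list, which is where the <pad> runs show up.
chat = [{"role": "user", "content": "List the 5 largest countries by area as a numbered list."}]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# The template already inserts <bos>, so skip the extra special tokens.
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
outputs = model.generate(input_ids=inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0]))
```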

Google org

Hi there, Surya from the Gemma team -- sorry for the delay. I saw this issue elsewhere as well. Are you using the right formatter? What are your sampling settings?
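
For reference, the -it checkpoints expect the `<start_of_turn>`/`<end_of_turn>` turn format, which `apply_chat_template` produces automatically. A minimal sketch with explicit sampling settings (the values are illustrative, not a tuned recommendation from the Gemma team):

```python
# Build the prompt through the chat template so the turn markers and <bos>
# are inserted correctly, then generate with explicit sampling settings.
chat = [{"role": "user", "content": "Introducing Einstein"}]
inputs = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=150,
    do_sample=True,    # greedy and beam search behave differently; worth comparing
    temperature=0.7,   # illustrative values, not a recommendation
    top_k=50,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```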
