Sample code does not work

#21
by mengyahu - opened

Hi, I tried the sample code on this page and the examples on the blog page. None of them work.

Example code from the blog:
from transformers import AutoTokenizer, pipeline
import torch

model = "google/gemma-7b-it"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",
)

messages = [
    {"role": "user", "content": "Who are you? Please, answer in pirate-speak."},
]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(
    prompt,
    max_new_tokens=256,
    add_special_tokens=True,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)
print(outputs[0]["generated_text"][len(prompt):])

It fails inside the Gemma attention forward pass:

attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)

RuntimeError: shape '[1, 21, 3072]' is invalid for input of size 86016

Other sample code on this page fails with a similar error:
attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)
RuntimeError: shape '[1, 9, 3072]' is invalid for input of size 36864
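For what it's worth, the numbers in both errors are consistent with Gemma-7B's attention dimensions (hidden_size 3072, but 16 heads × head_dim 256 = 4096 features per token, so the attention output is wider than hidden_size). A rough sketch of that arithmetic, assuming those config values (they are not stated in this thread):

# Rough sanity check of the shapes in the errors, assuming Gemma-7B's config
# values hidden_size=3072, num_attention_heads=16, head_dim=256 (not from this thread).
bsz, q_len = 1, 21
num_heads, head_dim, hidden_size = 16, 256, 3072
print(bsz * q_len * num_heads * head_dim)  # 86016 -> the "input of size" in the first error
print(bsz * q_len * hidden_size)           # 64512 -> what reshape(bsz, q_len, hidden_size) expects
print(1 * 9 * num_heads * head_dim)        # 36864 -> matches the second error

So the reshape appears to fail on code paths that assume num_heads * head_dim == hidden_size, which does not hold for Gemma-7B; upgrading (as noted further down) resolves it.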

Another example from this page:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "gg-hf/gemma-7b-it"
dtype = torch.bfloat16

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype=dtype,
)

chat = [
    {"role": "user", "content": "Write a hello world program"},
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

This raises:

OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/gg-hf/gemma-7b-it.
401 Client Error. (Request ID: Root=1-65d649ea-575cf19833d7d7e6652130cd;6509193c-c905-455a-ac88-004ea71906e3)

Cannot access gated repo for url https://huggingface.co/google/gemma-7b-it/resolve/main/config.json.
Repo model google/gemma-7b-it is gated. You must be authenticated to access it.
However, I have logged in with my token and accepted the terms on this page. I can access the model when running the other sample code.
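(In case it helps anyone debugging the same thing: one way to rule out a credential problem is to pass the token explicitly to from_pretrained. This is a generic sketch, not from the original post; token= is the keyword argument in recent transformers releases, while older ones use use_auth_token=.)

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Hypothetical debugging step: pass the access token explicitly instead of
# relying on the cached login, to rule out a credential problem.
hf_token = "hf_..."  # placeholder for your own read token
model_id = "google/gemma-7b-it"

tokenizer = AutoTokenizer.from_pretrained(model_id, token=hf_token)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype=torch.bfloat16,
    token=hf_token,
)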

Fixed the shape errors with pip install -U "torch>=2.1.1".
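A quick way to confirm which versions actually ended up installed (a generic check, not from the original post):

import torch
import transformers

# Print the installed versions to confirm the upgrade took effect
# in the environment you are running the sample from.
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)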

Google org

Patch release is done! Thanks all for the prompt report, and sorry for not catching it! pip install -U transformers will work as well now!

ArthurZ changed discussion status to closed

@ArthurZ Thanks! What about google/gemma-7b-it, which I don't have access to? This only happens when I try the sample with the chat format.

I also have the same issue. When I try to run the chat model as below:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "gg-hf/gemma-7b-it"
dtype = torch.bfloat16

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype=dtype,
)

I get the following error:

OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/gg-hf/gemma-7b-it.
401 Client Error. (Request ID: Root=1-65d7f55a-1297c7d45f184a9b5e0ea81c;b553c06e-9c85-4119-82dd-b11c45c0e771)

Cannot access gated repo for url https://huggingface.co/google/gemma-7b-it/resolve/main/config.json.
Repo model google/gemma-7b-it is gated. You must be authenticated to access it.

@charly9999 Assuming you are already logged in with a Hugging Face token that has read access, you also need to accept Google's terms and conditions on the model card page.
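For completeness, a minimal way to log in from Python and check which account your cached token resolves to (a generic sketch using huggingface_hub, not part of the original reply):

from huggingface_hub import login, whoami

# Log in with a read token (or run `huggingface-cli login` in a terminal once).
login(token="hf_...")  # placeholder for your own token

# Confirm which account the cached credentials belong to; this should be the
# same account that accepted the terms on the model card page.
print(whoami()["name"])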
