Sample code does not work

#21
by mengyahu - opened

Hi, I tried the sample code on this page and the examples on the blog page. None of them work.

Example code from the blog:
from transformers import AutoTokenizer, pipeline
import torch

model = "google/gemma-7b-it"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",
)

messages = [
    {"role": "user", "content": "Who are you? Please, answer in pirate-speak."},
]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(
    prompt,
    max_new_tokens=256,
    add_special_tokens=True,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)
print(outputs[0]["generated_text"][len(prompt):])

It fails inside the Gemma attention forward pass:

attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)

RuntimeError: shape '[1, 21, 3072]' is invalid for input of size 86016

Other sample code on this page fails with a similar error:
attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)
RuntimeError: shape '[1, 9, 3072]' is invalid for input of size 36864
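For what it's worth, the numbers in both errors are consistent with Gemma-7B's attention dimensions (hidden_size 3072, but 16 heads × head_dim 256 = 4096 features per token, so the attention output is wider than hidden_size). A rough sketch of that arithmetic, assuming those config values (they are not stated in this thread):

# Rough sanity check of the shapes in the errors, assuming Gemma-7B's config
# values hidden_size=3072, num_attention_heads=16, head_dim=256 (not from this thread).
bsz, q_len = 1, 21
num_heads, head_dim, hidden_size = 16, 256, 3072
print(bsz * q_len * num_heads * head_dim)  # 86016 -> the "input of size" in the first error
print(bsz * q_len * hidden_size)           # 64512 -> what reshape(bsz, q_len, hidden_size) expects
print(1 * 9 * num_heads * head_dim)        # 36864 -> matches the second error

So the reshape appears to fail on code paths that assume num_heads * head_dim == hidden_size, which does not hold for Gemma-7B; upgrading (as noted further down) resolves it.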

Another example from this page:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "gg-hf/gemma-7b-it"
dtype = torch.bfloat16

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype=dtype,
)

chat = [
    {"role": "user", "content": "Write a hello world program"},
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

This raises:

OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/gg-hf/gemma-7b-it.
401 Client Error. (Request ID: Root=1-65d649ea-575cf19833d7d7e6652130cd;6509193c-c905-455a-ac88-004ea71906e3)

Cannot access gated repo for url https://huggingface.co/google/gemma-7b-it/resolve/main/config.json.
Repo model google/gemma-7b-it is gated. You must be authenticated to access it.
However, I have logged in with my token and accepted the terms on this page. I can access the model when running the other sample code.
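(In case it helps anyone debugging the same thing: one way to rule out a credential problem is to pass the token explicitly to from_pretrained. This is a generic sketch, not from the original post; token= is the keyword argument in recent transformers releases, while older ones use use_auth_token=.)

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Hypothetical debugging step: pass the access token explicitly instead of
# relying on the cached login, to rule out a credential problem.
hf_token = "hf_..."  # placeholder for your own read token
model_id = "google/gemma-7b-it"

tokenizer = AutoTokenizer.from_pretrained(model_id, token=hf_token)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype=torch.bfloat16,
    token=hf_token,
)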

Fixed the shape errors with pip install -U "torch>=2.1.1".
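A quick way to confirm which versions actually ended up installed (a generic check, not from the original post):

import torch
import transformers

# Print the installed versions to confirm the upgrade took effect
# in the environment you are running the sample from.
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)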

Google org

Patch release is done! Thanks all for the prompt report, and sorry for not catching it! pip install -U transformers will work as well now!

ArthurZ changed discussion status to closed

@ArthurZ Thanks! What about google/gemma-7b-it, which I don't have access to? This only happens when I try the sample with the chat format.

I also have the same issue. When I try to run the chat model as below:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "gg-hf/gemma-7b-it"
dtype = torch.bfloat16

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype=dtype,
)

I get the following error:

OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/gg-hf/gemma-7b-it.
401 Client Error. (Request ID: Root=1-65d7f55a-1297c7d45f184a9b5e0ea81c;b553c06e-9c85-4119-82dd-b11c45c0e771)

Cannot access gated repo for url https://huggingface.co/google/gemma-7b-it/resolve/main/config.json.
Repo model google/gemma-7b-it is gated. You must be authenticated to access it.

@charly9999 Assuming you are already logged in with a Hugging Face token that has read access, you also need to accept Google's terms and conditions on the model card page.
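For completeness, a minimal way to log in from Python and check which account your cached token resolves to (a generic sketch using huggingface_hub, not part of the original reply):

from huggingface_hub import login, whoami

# Log in with a read token (or run `huggingface-cli login` in a terminal once).
login(token="hf_...")  # placeholder for your own token

# Confirm which account the cached credentials belong to; this should be the
# same account that accepted the terms on the model card page.
print(whoami()["name"])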
