Access to the gated repo & gemma-7b-it model from Hugging Face

#97
by hk199 - opened

I want to access the model so that I can complete my project, which translates text from English to various other languages.

This model is linked to a GitHub repository (link below), which I am cloning to my local machine in VS Code. GitHub repo link: https://github.com/doctranslate-io/viet-translation-llm.

It gives me the following error:
OSError: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/google/gemma-7b-it.
401 Client Error. (Request ID: Root=1-67057062-38b576a8416d6a5720107b86;9d47d074-89e2-45ba-86a9-809d9cd8250e)

Cannot access gated repo for url https://huggingface.co/google/gemma-7b-it/resolve/main/config.json.
Access to model google/gemma-7b-it is restricted. You must have access to it and be authenticated to access it. Please log in.

P.S. Can someone point me in the right direction as to whom I should ask for this access?

[Screenshot attachment: PhoMT Access Error.PNG]


Hi @hk199

You're requesting access to a gated Gemma model. First, make sure you've requested and been granted access on the model page (https://huggingface.co/google/gemma-7b-it); the 401 error also means your environment isn't authenticated with a Hugging Face account that has that access.

Important: Accessing gated models directly from client-side environments (like web browsers) is not supported due to security risks.

Here's how to authenticate on your server or local machine:

Step 1: Generate a Hugging Face User Access Token

Go to your Hugging Face settings: https://huggingface.co/settings/tokens

Click "New token".

Give your token a descriptive name (e.g., "HF_TOKEN").

We recommend keeping the default "Read" access.

Click "Generate a token" and copy the token to your clipboard.

Step 2: Set Up Authentication in Your Server-Side/Local-Machine Code

You'll need to make the token available as the HF_TOKEN environment variable in your server-side or local environment. How you do this depends on your specific setup, but here's a general example:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import os

# Set the access token as an environment variable.
# (For anything beyond quick testing, export HF_TOKEN in your shell
# instead of hardcoding it in source code.)
os.environ["HF_TOKEN"] = "YOUR_TOKEN_HERE"

# Both calls pick up HF_TOKEN automatically; token=True makes it explicit.
# (use_auth_token is deprecated in recent transformers releases.)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b-it",
    torch_dtype=torch.bfloat16,
    token=True,
)

input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
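
Once access and authentication are in place, here is a minimal sketch of your translation use case with the same model, using the tokenizer's chat template (the prompt text and max_new_tokens value are illustrative, not a fixed recipe):

# Reuse the tokenizer and model loaded above for a translation prompt.
messages = [
    {"role": "user", "content": "Translate the following English text to Vietnamese: 'Hello, how are you?'"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))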

If you encounter any further issues or have specific questions about your local setup, feel free to ask!
