403 Forbidden error when accessing the model

#3
by imhaggarwal - opened

from langchain_huggingface import HuggingFaceEndpoint

model_id = "elyza/ELYZA-japanese-Llama-2-7b-instruct"
llm_hub = HuggingFaceEndpoint(repo_id=model_id, temperature=0.1, max_new_tokens=600, model_kwargs={"max_length": 600})

I am using the above code to load the model. Since the model is larger than my RAM, I guess it won't be possible to load it locally, so I want to use the Inference API instead.

I am setting the Hugging Face token via os.environ["HUGGINGFACEHUB_API_TOKEN"], but I still get the error below:
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://api-inference.huggingface.co/models/elyza/ELYZA-japanese-Llama-2-7b-instruct

The same code works for other large models. I even tried changing the access token's permission from Inference to Read & Write, but that did not work either.
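For what it's worth, a 403 on a specific repo (while other models work) can also mean the repo is gated, i.e. you have to accept the license on the model page before your token can reach it, rather than a plan issue. A minimal sketch to check this with `huggingface_hub` (the `repo_id` and env var name are from my code above; the helper name is just for illustration):

```python
import os
from huggingface_hub import HfApi
from huggingface_hub.utils import GatedRepoError, RepositoryNotFoundError

def check_model_access(repo_id: str) -> str:
    """Return a short diagnosis of whether the current token can reach repo_id."""
    api = HfApi(token=os.environ.get("HUGGINGFACEHUB_API_TOKEN"))
    try:
        # A plain metadata request fails with the same auth errors as inference would.
        api.model_info(repo_id)
        return "ok"
    except GatedRepoError:
        return "gated: accept the license on the model page first"
    except RepositoryNotFoundError:
        return "not found, or the token lacks access"

if __name__ == "__main__":
    print(check_model_access("elyza/ELYZA-japanese-Llama-2-7b-instruct"))
```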

Does this have something to do with my Hugging Face plan? Can anyone please help me with this?
