Issue Loading Model from Hugging Face with transformers in Python

#3
by eibeel - opened

Hello,

I'm encountering an issue while trying to load a model from Hugging Face using the transformers library in Python. Despite setting up my environment with the necessary authentication token and ensuring internet connectivity, I'm unable to load the model. Below is the code snippet I'm using:

import os
from transformers import AutoTokenizer, AutoModelForCausalLM

# Set your Hugging Face token here
hf_token = "XXX"
os.environ['HF_TOKEN'] = hf_token

# Name of the model you want to load
model_name = "mistralai/Mixtral-8x22B-v0.1"

try:
    # Attempt to load the tokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_name, add_eos_token=True, use_fast=True)
    tokenizer.pad_token = tokenizer.unk_token
    tokenizer.pad_token_id = tokenizer.unk_token_id
    tokenizer.padding_side = 'left'
    print("Tokenizer loaded successfully.")

    # Attempt to load the model
    model = AutoModelForCausalLM.from_pretrained(model_name)
    print("Model loaded successfully.")

except Exception as e:
    print(f"Error loading the tokenizer or model: {e}")

However, I receive the following error message:

Error loading the tokenizer or model: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like mistralai/Mixtral-8x22B-v0.1 is not the path to a directory containing a file named config.json. Check out your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.

The error suggests issues with either internet connectivity or the path to the model. I've confirmed that my internet connection is stable. I suspect there might be an issue with how the model is being referenced or with the Hugging Face hub itself.
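
In case it helps with diagnosis, the smallest check I can think of (just a sketch, reusing the same hf_token and model_name placeholders as above) is to download only config.json directly with huggingface_hub; the exception it raises should show whether the problem is authentication, the repo reference, or the network itself:

from huggingface_hub import hf_hub_download

hf_token = "XXX"  # same placeholder token as above
model_name = "mistralai/Mixtral-8x22B-v0.1"

try:
    # Fetch only config.json; the error (401/403, 404, or a connection error)
    # narrows down whether this is auth, a bad repo reference, or the network
    path = hf_hub_download(repo_id=model_name, filename="config.json", token=hf_token)
    print(f"config.json downloaded to: {path}")
except Exception as e:
    print(f"Direct download failed: {e}")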

Any assistance on how to resolve this issue would be greatly appreciated!

I have the same error and still haven't found a solution.

To first isolate whether this is a network connectivity issue or a software/HF issue:

(1) I understand that your network connection is stable, but can you successfully ping huggingface.co? I want to confirm that there isn't a router or proxy issue blocking you.

(2) If so, can you successfully load a different model besides mistralai/Mixtral-8x22B-v0.1? (A sketch covering (2)–(4) follows after this list.)

(3) Can you confirm that your token is properly set and has the right permissions?

(4) If you use offline mode, the model has to be cached locally; you can confirm whether it's present on your file system.
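
If it's useful, here is a rough sketch covering (2)–(4). The "gpt2" name is just an arbitrary small public model chosen for the test, and hf_token / model_name are the same placeholders as in your snippet:

from huggingface_hub import whoami, scan_cache_dir
from transformers import AutoTokenizer

hf_token = "XXX"  # placeholder, same as in your snippet
model_name = "mistralai/Mixtral-8x22B-v0.1"

# (2) Try a small, public model first to rule out a general download problem
try:
    AutoTokenizer.from_pretrained("gpt2")
    print("Loading a public model works, so basic connectivity is fine.")
except Exception as e:
    print(f"Even a public model fails to load: {e}")

# (3) Verify that the token is valid and recognized by the Hub
try:
    user = whoami(token=hf_token)
    print(f"Token is valid for user: {user['name']}")
except Exception as e:
    print(f"Token check failed: {e}")

# (4) List what is already cached locally (relevant for offline mode)
try:
    for repo in scan_cache_dir().repos:
        print(repo.repo_id, repo.size_on_disk_str)
except Exception as e:
    print(f"Could not scan the local cache: {e}")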

Just a suggestion, but here is how I would consider rewriting your code:

import os
from transformers import AutoTokenizer, AutoModelForCausalLM
from huggingface_hub import login
import requests

# Set your Hugging Face token here
hf_token = "XXX"
os.environ['HF_TOKEN'] = hf_token

# Name of the model we want to load
model_name = "mistralai/Mixtral-8x22B-v0.1"

# Function to check internet connection
def check_internet_connection(url='https://huggingface.co'):
    try:
        response = requests.get(url, timeout=5)
        return response.status_code == 200
    except requests.RequestException:
        return False

# Function to verify model existence
def verify_model_exists(model_name):
    url = f'https://huggingface.co/{model_name}/resolve/main/config.json'
    try:
        # Send the token as well so gated/private repos are not misreported as missing
        response = requests.head(url, headers={"Authorization": f"Bearer {hf_token}"})
        return response.status_code == 200
    except requests.RequestException:
        return False

# Check internet connection
if not check_internet_connection():
    print("No internet connection… check your network settings.")
else:
    # Verify model existence
    if not verify_model_exists(model_name):
        print(f"Model {model_name} does not exist on Hugging Face.")
    else:
        try:
            # Log in to Hugging Face
            login(hf_token)

            # Attempt to load the tokenizer
            tokenizer = AutoTokenizer.from_pretrained(model_name, add_eos_token=True, use_fast=True)
            tokenizer.pad_token = tokenizer.unk_token
            tokenizer.pad_token_id = tokenizer.unk_token_id
            tokenizer.padding_side = 'left'
            print("Tokenizer loaded successfully.")

            # Attempt to load the model
            model = AutoModelForCausalLM.from_pretrained(model_name)
            print("Model loaded successfully.")
        except Exception as e:
            print(f"Error loading the tokenizer or model: {e}")
eibeel changed discussion status to closed
eibeel changed discussion status to open
eibeel changed discussion status to closed
