Issue with using the model in Spaces

#13
by gospacedev - opened
Cognitive Computations org
edited Dec 19, 2023
from huggingface_hub import InferenceClient

client = InferenceClient(
    "ehartford/dolphin-2.5-mixtral-8x7b"
)

When I try to use ehartford/dolphin-2.5-mixtral-8x7b in Spaces, I get these errors:

huggingface_hub.utils._errors.HfHubHTTPError: 403 Client Error: Forbidden for url: https://api-inference.huggingface.co/models/ehartford/dolphin-2.5-mixtral-8x7b (Request ID: bttrYLuVoD5jjxUm9RxFm)

The model ehartford/dolphin-2.5-mixtral-8x7b is too large to be loaded automatically (93GB > 10GB). Please use Spaces (https://huggingface.co/spaces) or Inference Endpoints (https://huggingface.co/inference-endpoints).

requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://api-inference.huggingface.co/models/ehartford/dolphin-2.5-mixtral-8x7b

Is API access to this model restricted, or is it just too large? When I tried mistralai/Mixtral-8x7B-Instruct-v0.1 with InferenceClient, it worked.


You may be launching it in a Space, but you are still using InferenceClient, which calls the hosted Inference API. Models larger than 10 GB cannot be served through the free Inference API, and a free Space does not have the hardware to run a model this size either.
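The 93 GB figure in the error message is consistent with the model's checkpoint size. A rough back-of-the-envelope sketch (the ~46.7B parameter count for Mixtral 8x7B is approximate; the experts share the attention layers, so the total is well under 8 × 7B):

```python
# Rough estimate of the dolphin-2.5-mixtral-8x7b checkpoint size.
params = 46.7e9          # approximate total parameter count (assumption)
bytes_per_param = 2      # fp16 / bf16 weights
size_gb = params * bytes_per_param / 1e9
print(f"{size_gb:.0f} GB")  # ≈ 93 GB, far above the free API's 10 GB limit
```

That is why the error says "93GB > 10GB": the free Inference API only auto-loads checkpoints up to 10 GB.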

If you ever want to REALLY run the model itself, in a Space or on your own machine, you will not be using the Inference API; instead you will use transformers, i.e. download the model and run it on your own hardware (or the Space's hardware if you run it there).
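A minimal sketch of that transformers route, assuming a machine (or paid Space) with enough GPU memory; the 4-bit quantization option (via bitsandbytes) is one common way to shrink the ~93 GB fp16 checkpoint to a more manageable footprint:

```python
# Sketch: run the model with transformers instead of the Inference API.
# The heavy download/load only happens when executed as a script.

MODEL_ID = "ehartford/dolphin-2.5-mixtral-8x7b"

def load_kwargs(four_bit: bool = True) -> dict:
    """Keyword arguments for AutoModelForCausalLM.from_pretrained()."""
    kwargs = {"device_map": "auto"}   # spread layers across available GPUs
    if four_bit:
        kwargs["load_in_4bit"] = True  # requires the bitsandbytes package
    return kwargs

if __name__ == "__main__":
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, **load_kwargs())
    inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(out[0]))
```

This is a sketch, not a drop-in solution: even 4-bit, the model needs a GPU with on the order of 25+ GB of memory, which free Space hardware does not provide.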

I hope I managed to be of some help!


As for Mixtral, I was surprised too, but Mixtral uses a sparse mixture-of-experts architecture that activates only a fraction of its parameters for each token, which makes inference considerably cheaper; that may be why it works through the free API.


Thank you for helping me!
