Inference API

The huggingface_hub library lets you access the Inference API programmatically. For more information about the Accelerated Inference API, refer to its documentation.

class huggingface_hub.InferenceApi

( repo_id: str task: typing.Optional[str] = None token: typing.Optional[str] = None gpu: bool = False )

Client to configure requests and make calls to the Hugging Face Inference API.

Example:

>>> from huggingface_hub.inference_api import InferenceApi

>>> # Mask-fill example
>>> inference = InferenceApi("bert-base-uncased")
>>> inference(inputs="The goal of life is [MASK].")
[{'sequence': 'the goal of life is life.', 'score': 0.10933292657136917, 'token': 2166, 'token_str': 'life'}]

>>> # Question Answering example
>>> inference = InferenceApi("deepset/roberta-base-squad2")
>>> inputs = {
...     "question": "What's my name?",
...     "context": "My name is Clara and I live in Berkeley.",
... }
>>> inference(inputs)
{'score': 0.9326569437980652, 'start': 11, 'end': 16, 'answer': 'Clara'}

>>> # Zero-shot example
>>> inference = InferenceApi("typeform/distilbert-base-uncased-mnli")
>>> inputs = "Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!"
>>> params = {"candidate_labels": ["refund", "legal", "faq"]}
>>> inference(inputs, params)
{'sequence': 'Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!', 'labels': ['refund', 'faq', 'legal'], 'scores': [0.9378499388694763, 0.04914155602455139, 0.013008488342165947]}

>>> # Overriding configured task
>>> inference = InferenceApi("bert-base-uncased", task="feature-extraction")

>>> # Text-to-image
>>> inference = InferenceApi("stabilityai/stable-diffusion-2-1")
>>> inference("cat")
<PIL.PngImagePlugin.PngImageFile image (...)>

>>> # Return as raw response to parse the output yourself
>>> inference = InferenceApi("mio/amadeus")
>>> response = inference("hello world", raw_response=True)
>>> response.headers
{"Content-Type": "audio/flac", ...}
>>> response.content # raw bytes from server
b'(...)'

__init__

( repo_id: str task: typing.Optional[str] = None token: typing.Optional[str] = None gpu: bool = False )

Parameters

  • repo_id (str) — ID of the repository (e.g. user/bert-base-uncased).
  • task (str, optional, defaults to None) — Task to force, instead of the task specified in the repository.
  • token (str, optional) — The API token to use as HTTP bearer authorization. This is not the authentication token. You can find the token at https://huggingface.co/settings/token. Alternatively, you can find both your organization and personal API tokens using HfApi().whoami(token).
  • gpu (bool, optional, defaults to False) — Whether to use GPU instead of CPU for inference (requires at least the Startup plan).

Initializes the headers and API call information.
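
The token parameter above is described as an HTTP bearer credential. As a rough illustration of what that means for the request headers, here is a minimal sketch. build_headers is a hypothetical helper written for this example, not a function of huggingface_hub, and the library may set additional headers that are not shown here.

```python
from typing import Dict, Optional


def build_headers(token: Optional[str] = None) -> Dict[str, str]:
    """Hypothetical helper: build HTTP headers for an Inference API request."""
    headers: Dict[str, str] = {}
    if token is not None:
        # Bearer authorization, as described for the `token` parameter.
        headers["Authorization"] = f"Bearer {token}"
    return headers


# "hf_xxx" is a placeholder value, not a real token.
print(build_headers("hf_xxx"))
```

When no token is given, the sketch sends no Authorization header at all, which corresponds to an anonymous call.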

__call__

( inputs: typing.Union[str, typing.Dict, typing.List[str], typing.List[typing.List[str]], NoneType] = None params: typing.Optional[typing.Dict] = None data: typing.Optional[bytes] = None raw_response: bool = False )

Parameters

  • inputs (str or Dict or List[str] or List[List[str]], optional) — Inputs for the prediction.
  • params (Dict, optional) — Additional parameters for the model. Will be sent as parameters in the payload.
  • data (bytes, optional) — Bytes content of the request. In this case, leave inputs and params empty.
  • raw_response (bool, defaults to False) — If True, the raw Response object is returned. You can parse its content as preferred. By default, the content is parsed into a more practical format (a JSON dictionary or a PIL Image, for example).
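
To make the relationship between inputs and params more concrete, the sketch below assembles a JSON payload from the two arguments. build_payload is a hypothetical helper, and the key names "inputs" and "parameters" are assumptions based on the parameter descriptions above. The payload the library actually sends may contain other fields.

```python
import json
from typing import Any, Dict, Optional


def build_payload(
    inputs: Any = None, params: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Hypothetical helper: combine `inputs` and `params` into one payload.

    The key names "inputs" and "parameters" are assumptions.
    """
    payload: Dict[str, Any] = {}
    if inputs is not None:
        payload["inputs"] = inputs
    if params is not None:
        payload["parameters"] = params
    return payload


# Zero-shot style call: text input plus candidate labels as parameters.
payload = build_payload(
    "My device is not working and I would like a refund.",
    {"candidate_labels": ["refund", "legal", "faq"]},
)
print(json.dumps(payload, indent=2))
```

When data is used instead, the request body is the raw bytes themselves, which is why inputs and params should be left empty in that case.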

Make a call to the Inference API.
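
With raw_response=True, parsing is left to the caller. The following sketch shows one possible way to branch on the Content-Type header. parse_content is a hypothetical helper, not part of huggingface_hub, and it handles only JSON. Every other media type is returned as raw bytes, which is an illustrative choice and not what the library does by default.

```python
import json
from typing import Any


def parse_content(content_type: str, content: bytes) -> Any:
    """Hypothetical helper: decode a response body based on its Content-Type."""
    # Drop optional parameters such as "; charset=utf-8".
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(content.decode("utf-8"))
    # Audio, images and other media types are left as raw bytes in this sketch.
    return content


print(parse_content("application/json", b'{"answer": "Clara"}'))
```

Given a response obtained with raw_response=True, the helper could be called as parse_content(response.headers["Content-Type"], response.content).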