Hub

Inference Providers

Please refer to the Inference Providers Documentation for detailed information.

What is HF-Inference API?

HF-Inference API is one of the many providers available on the Hugging Face Hub. It is deployed by Hugging Face ourselves, using text-generation-inference for LLMs for instance. This service used to be called “Inference API (serverless)” prior to Inference Providers.

For more details about the HF-Inference API, check out its dedicated page.

What technology do you use to power the HF-Inference API?

The HF-Inference API is powered by Inference Endpoints under the hood.

Why don’t I see an inference widget, or why can’t I use the API?

For some tasks, there might not be support by any Inference Provider, and hence, there is no widget.

How can I see my usage?

To check usage across all providers, check out your billing page.

To check your HF-Inference usage specifically, check out the Inference Dashboard. The dashboard shows both your serverless and dedicated endpoints usage.

Is there programmatic access to Inference Providers?

Yes! We provide client wrappers in both JS and Python:

< > Update on GitHub