Inference Endpoints (dedicated) documentation

Create an Endpoint


After your first login, you will be directed to the Endpoint creation page. As an example, this guide will go through the steps to deploy distilbert-base-uncased-finetuned-sst-2-english for text classification.

1. Enter the Hugging Face Repository ID and your desired Endpoint name


2. Select your Instance Configuration

Choose a cloud provider, region, and instance type. If you're looking for a specific cloud provider, region, or instance that you don't yet see available, please let us know.


3. Apply Automatic Scale-to-Zero

Enable automatic scale-to-zero if you want the Endpoint to shut down while idle, or leave your Endpoint as is.


4. Define the Security Level for the Endpoint


5. Customize your Endpoint

Feel free to customize your Endpoint further in Advanced Configuration: replica autoscaling, task, revision, framework, and container type are accessible in this section.

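The choices made in steps 1 to 5 can also be written down as code. The following is a minimal sketch, assuming the `huggingface_hub` Python library and its `create_inference_endpoint` function; the endpoint name, vendor, region, and instance values are illustrative placeholders and may not match what is available to your account. The helper names `build_endpoint_kwargs` and `create_endpoint` are hypothetical.

```python
from typing import Any, Dict


def build_endpoint_kwargs(name: str, repository: str, scale_to_zero: bool = False) -> Dict[str, Any]:
    """Collect the choices from steps 1 to 5 as keyword arguments (illustrative values)."""
    return {
        # Step 1: Endpoint name and Hugging Face Repository ID
        "name": name,
        "repository": repository,
        # Step 2: cloud provider, region, and instance (placeholders, check what is available)
        "vendor": "aws",
        "region": "us-east-1",
        "accelerator": "cpu",
        "instance_size": "x2",
        "instance_type": "intel-icl",
        # Step 3: a minimum of 0 replicas allows the Endpoint to scale to zero
        "min_replica": 0 if scale_to_zero else 1,
        "max_replica": 1,
        # Step 4: security level ("public", "protected", or "private")
        "type": "protected",
        # Step 5: advanced configuration
        "framework": "pytorch",
        "task": "text-classification",
    }


def create_endpoint(kwargs: Dict[str, Any]):
    """Send the request; needs `huggingface_hub` installed and a logged-in account."""
    # Imported lazily so the builder above works without the library.
    from huggingface_hub import create_inference_endpoint

    return create_inference_endpoint(**kwargs)


kwargs = build_endpoint_kwargs(
    "my-endpoint-name",  # hypothetical Endpoint name
    "distilbert-base-uncased-finetuned-sst-2-english",
    scale_to_zero=True,
)
print(kwargs["min_replica"])
```

Calling `create_endpoint(kwargs)` would then correspond to clicking Create Endpoint in the UI; it is left uncalled here because it creates a billable resource.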

6. Create your Endpoint

Click Create Endpoint. The cost estimate displayed is per hour and does not take autoscaling into account.

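Because the displayed estimate is per hour and ignores autoscaling, a rough projection has to account for replicas separately. The sketch below assumes that cost grows linearly with the number of running replicas and with uptime; the hourly rate of 0.50 is a made-up figure, not a real price, and `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(hourly_rate_usd: float, hours: float, replicas: int = 1) -> float:
    """Rough cost projection: hourly rate x hours of uptime x running replicas."""
    return round(hourly_rate_usd * hours * replicas, 2)


# One replica running for a full day at a hypothetical 0.50 per hour
single = estimate_cost(0.50, 24)
# The same day with autoscaling holding 3 replicas the whole time
scaled = estimate_cost(0.50, 24, replicas=3)

print(single)  # 12.0
print(scaled)  # 36.0
```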

7. Wait for the Endpoint to build, initialize and run

Note that initialization time depends on the model size and typically takes between 1 and 5 minutes.

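If you script this step instead of watching the UI, waiting for the Endpoint amounts to polling its status until it reports that it is running. The following is a generic sketch: `get_status` is a hypothetical callable standing in for whatever status lookup you use, and the status strings are assumptions modeled on the build, initialize, and run phases above. The `huggingface_hub` library's `InferenceEndpoint.wait()` method serves a similar purpose.

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,
    poll_every_s: float = 5.0,
) -> str:
    """Poll get_status() until it returns "running", fails, or times out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "running":
            return status
        if status == "failed":
            raise RuntimeError("Endpoint failed to start")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Endpoint still '{status}' after {timeout_s} seconds")
        time.sleep(poll_every_s)


# Simulated status sequence standing in for a real Endpoint (assumed status names)
fake_statuses = iter(["building", "initializing", "running"])
final_status = wait_until_running(lambda: next(fake_statuses), poll_every_s=0.0)
print(final_status)
```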

8. Test your Endpoint 🎉

You can do this in your Endpoint Overview using the Playground 🏁!

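Outside the Playground, a running Endpoint can be tested with a plain HTTP request. The sketch below only builds the request with the Python standard library and does not send it; it assumes the Endpoint accepts a JSON body with an `inputs` field and a bearer token in the `Authorization` header. The URL and the token are placeholders, so replace them with the URL shown in your Endpoint Overview and your own access token.

```python
import json
import urllib.request


def build_request(endpoint_url: str, token: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying the text to classify as JSON."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "https://example.invalid/my-endpoint",  # placeholder URL
    "hf_xxx",  # placeholder token
    "I like you. I love you.",
)

# To actually send it against a real Endpoint:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

For the text classification model used in this guide, the response would be expected to contain labels with scores, though the exact shape depends on the task and container.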