Inference Endpoints (dedicated) documentation

Update your Endpoint

You can update a running Endpoint to change parts of its configuration. If your Endpoint is in a failed state, however, you need to create a new Endpoint instead. To update an Endpoint, navigate to its “settings” tab.

You can update the instance size, autoscaling configuration, task, and repository revision.

Instance size

You can update the instance size of your Endpoint in the Endpoint overview menu to match your evolving needs. For example, you can downsize to a smaller instance if you don’t need the compute, or upgrade to a larger instance if you need more compute.

You can only change the instance size within your current instance type (CPU or GPU). Switching from one instance type to another (CPU to GPU or vice versa) is not supported.
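
The same rule can be enforced when resizing programmatically. The following is a minimal sketch, assuming the `huggingface_hub` Python client, whose `InferenceEndpoint.update` method accepts an `instance_size` keyword argument. The helper name `resize_endpoint`, the Endpoint name, and the size value `"x2"` are hypothetical placeholders, and the caller is assumed to know the current and target accelerator.

```python
def resize_endpoint(endpoint, current_accelerator, target_accelerator, new_size):
    """Change the instance size, refusing CPU <-> GPU switches.

    `endpoint` is any object with an `update(**kwargs)` method, for example
    an `InferenceEndpoint` from `huggingface_hub` (assumption).
    """
    if current_accelerator != target_accelerator:
        # Mirrors the documented restriction: the instance type cannot change.
        raise ValueError(
            f"Cannot switch from {current_accelerator} to {target_accelerator}; "
            "create a new Endpoint instead."
        )
    return endpoint.update(instance_size=new_size)


# Hypothetical usage (requires `pip install huggingface_hub` and a valid token):
# from huggingface_hub import get_inference_endpoint
# endpoint = get_inference_endpoint("my-endpoint-name")  # placeholder name
# resize_endpoint(endpoint, "gpu", "gpu", "x2")          # placeholder size
```

Checking the accelerator before calling `update` surfaces the restriction as a clear local error rather than relying on the API to reject the request.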

Autoscaling

You can update the autoscaling configuration of your Endpoint in the settings menu. Adjust the minimum and maximum number of replicas to control how far your Endpoint scales up or down. Learn more in the autoscaling documentation.
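
Replica bounds can also be adjusted from code. This is a minimal sketch, assuming the `huggingface_hub` Python client, whose `InferenceEndpoint.update` method accepts `min_replica` and `max_replica` keyword arguments. The helper `set_replica_range` and its validation rules are illustrative assumptions, not part of the library.

```python
def set_replica_range(endpoint, min_replica, max_replica):
    """Validate and apply new autoscaling bounds.

    `endpoint` is any object with an `update(**kwargs)` method, for example
    an `InferenceEndpoint` from `huggingface_hub` (assumption).
    """
    if min_replica < 0:
        raise ValueError("min_replica must not be negative")
    if max_replica < 1:
        raise ValueError("max_replica must be at least 1")
    if min_replica > max_replica:
        raise ValueError("min_replica must not exceed max_replica")
    return endpoint.update(min_replica=min_replica, max_replica=max_replica)


# Hypothetical usage (requires `pip install huggingface_hub` and a valid token):
# from huggingface_hub import get_inference_endpoint
# endpoint = get_inference_endpoint("my-endpoint-name")  # placeholder name
# set_replica_range(endpoint, min_replica=1, max_replica=4)
```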

Task

You can update the task of your running Endpoint in the settings menu. The task defines the pipeline type your Endpoint will use and the inference widget on the Endpoint overview.

Revision

You can update the revision of your running Endpoint in the settings menu. The revision defines the version of the model repository you want to use for inference.
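
Task and revision changes can be scripted in the same way. The sketch below assumes the `huggingface_hub` Python client, whose `InferenceEndpoint.update` method accepts `task` and `revision` keyword arguments and whose `status` attribute compares equal to `"failed"` for a failed Endpoint. The helper `update_model_settings`, the Endpoint name, and the example values are hypothetical.

```python
def update_model_settings(endpoint, task=None, revision=None):
    """Update the task and/or repository revision of an Endpoint.

    `endpoint` is any object with a `status` attribute and an
    `update(**kwargs)` method, for example an `InferenceEndpoint`
    from `huggingface_hub` (assumption).
    """
    if getattr(endpoint, "status", None) == "failed":
        # A failed Endpoint cannot be updated; a new one must be created.
        raise RuntimeError("Endpoint is in a failed state; create a new Endpoint.")
    # Send only the fields the caller actually wants to change.
    changes = {
        key: value
        for key, value in {"task": task, "revision": revision}.items()
        if value is not None
    }
    if not changes:
        return endpoint
    return endpoint.update(**changes)


# Hypothetical usage (requires `pip install huggingface_hub` and a valid token):
# from huggingface_hub import get_inference_endpoint
# endpoint = get_inference_endpoint("my-endpoint-name")  # placeholder name
# update_model_settings(endpoint, task="text-classification", revision="main")
```

Pinning `revision` to a specific commit hash rather than a branch name keeps the deployed model version fixed until you change it.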
