Inference Endpoints (dedicated) documentation

🤗 Inference Endpoints

🤗 Inference Endpoints offers a secure production solution to easily deploy any 🤗 Transformers, Sentence-Transformers, and Diffusion models from the Hub on dedicated, autoscaling infrastructure managed by Hugging Face.

A Hugging Face Endpoint is built from a Hugging Face Model Repository. When an Endpoint is created, the service creates image artifacts, which are either built from the model you select or taken from a container image you provide. The image artifacts are completely decoupled from the Hugging Face Hub source repositories to ensure the highest levels of security and reliability.

🤗 Inference Endpoints supports all of the 🤗 Transformers, Sentence-Transformers, and Diffusion tasks, as well as custom tasks that 🤗 Transformers does not yet support, such as speaker diarization and diffusion.
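
As a rough illustration of how a task is invoked once an Endpoint is running, the sketch below builds (but does not send) an HTTP request using only the Python standard library. It assumes the common convention of a JSON body with an `inputs` field and a bearer token in the `Authorization` header. The URL, token, and helper name `build_endpoint_request` are hypothetical placeholders, not values from this page.

```python
import json
import urllib.request


def build_endpoint_request(endpoint_url: str, token: str, inputs: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a deployed Endpoint.

    Assumption: the Endpoint accepts a JSON body of the form {"inputs": ...}
    and authenticates with a bearer token.
    """
    payload = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values: replace with your own Endpoint URL and access token.
request = build_endpoint_request(
    "https://my-endpoint.example.com",
    "hf_xxx",
    "I love this movie!",
)

# To actually send the request, pass it to urllib.request.urlopen(request).
```

The shape of the response depends on the task the Endpoint serves, so parsing is left out here.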

In addition, 🤗 Inference Endpoints gives you the option to use a custom container image hosted on an external registry, such as Docker Hub, AWS ECR, Azure ACR, or Google GCR.
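
A custom container image is typically described by a small configuration object. The sketch below is a minimal illustration in plain Python. The keys `url`, `health_route`, and `env` are assumptions modeled on the `custom_image` argument of `huggingface_hub.create_inference_endpoint`, and the image name and environment variable are hypothetical. Check the current API reference before relying on any of these names.

```python
from typing import Dict, Optional


def custom_image_spec(
    image_url: str,
    health_route: str = "/health",
    env: Optional[Dict[str, str]] = None,
) -> dict:
    """Assemble a custom-image description as a plain dict.

    Assumption: the keys used here ("url", "health_route", "env") are
    illustrative and may not match the service's actual schema.
    """
    if not image_url:
        raise ValueError("image_url must not be empty")
    return {
        "url": image_url,              # registry path of the container image
        "health_route": health_route,  # route polled to check the container is up
        "env": dict(env or {}),        # environment variables passed to the container
    }


# Hypothetical image hosted on Docker Hub.
spec = custom_image_spec(
    "docker.io/my-org/my-inference-image:1.0",
    env={"MODEL_ID": "/repository"},
)
```

Keeping the description as plain data makes it easy to review or store alongside the rest of a deployment's configuration.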

[Figure: Endpoint creation flow]

Documentation and Examples

Guides

Others