Inference Endpoints not only allows you to customize your inference handler, it also allows you to provide a custom container image. These can be public images like `tensorflow/serving:2.7.3` or private images hosted on Docker Hub, AWS ECR, Azure ACR, or Google GCR.
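To make a private image available to Inference Endpoints, you first push it to one of these registries. A minimal sketch using Docker Hub, where the image name and tag are placeholders you would replace with your own:

```bash
# Build the custom serving image locally and push it to Docker Hub
# so Inference Endpoints can pull it. The name and tag are placeholders.
docker build -t my-org/my-serving-image:1.0 .
docker push my-org/my-serving-image:1.0
```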
The creation flow of your image artifacts from a custom image is the same as with the base image: Inference Endpoints creates a unique image artifact derived from your provided image, including all model artifacts.
The model artifacts (weights) are stored under `/repository`. For example, if you use `tensorflow/serving` as your custom image, you have to set `model_base_path="/repository"`:
```bash
tensorflow_model_server \
  --rest_api_port=5000 \
  --model_name=my_model \
  --model_base_path="/repository"
```
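Once the server is up, you can verify it by calling TensorFlow Serving's REST predict API on the port configured above. A minimal sketch, assuming the model name `my_model` from the command and a hypothetical input payload that you would replace with whatever matches your model's actual signature:

```bash
# Query TensorFlow Serving's REST API (v1/models/<name>:predict).
# The "instances" payload is a placeholder; adjust it to your model's signature.
curl -X POST http://localhost:5000/v1/models/my_model:predict \
  -H "Content-Type: application/json" \
  -d '{"instances": [[1.0, 2.0, 3.0]]}'
```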