How to Deploy on Inference Endpoints - handler.py

#3
by brianjking - opened

Hello,

I'd love to deploy this to a Hugging Face Inference Endpoint, but the repository is missing a handler.py file.

Does anyone have any tips? https://huggingface.co/docs/inference-endpoints/guides/custom_handler


When I tried to deploy it I received this error:

56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Start loading image artifacts from huggingface.co
56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Repository Revision: 6c0cf6bef6330a114473cb5cec43d7beeb2a74ac
56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Used configuration:
56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Repository ID: Salesforce/instructblip-flan-t5-xl
56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Ignore regex pattern for files, which are not downloaded: tf*, flax*, rust*, *onnx, *safetensors, *mlmodel, *tflite, *tar.gz, *ckpt
56f5d4ff8bqr6jr 2023-07-10T16:42:45.193Z 2023-07-10 16:42:45,192 | INFO | Initializing model from directory: /repository
56f5d4ff8bqr6jr 2023-07-10T16:42:45.193Z 2023-07-10 16:42:45,192 | INFO | No custom pipeline found at /repository/handler.py
56f5d4ff8bqr6jr 2023-07-10T16:42:45.193Z 2023-07-10 16:42:45,193 | INFO | Using device GPU

Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 677, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 566, in __aenter__
    await self._router.startup()
  File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 654, in startup
    await handler()
  File "/app/webservice_starlette.py", line 57, in some_startup_task
    inference_handler = get_inference_handler_either_custom_or_default_handler(HF_MODEL_DIR, task=HF_TASK)
  File "/app/huggingface_inference_toolkit/handler.py", line 46, in get_inference_handler_either_custom_or_default_handler
    return HuggingFaceHandler(model_dir=model_dir, task=task)
  File "/app/huggingface_inference_toolkit/handler.py", line 17, in __init__
    self.pipeline = get_pipeline(model_dir=model_dir, task=task)
  File "/app/huggingface_inference_toolkit/utils.py", line 263, in get_pipeline
    hf_pipeline = pipeline(task=task, model=model_dir, device=device, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/transformers/pipelines/__init__.py", line 692, in pipeline
    config = AutoConfig.from_pretrained(model, _from_pipeline=task, **hub_kwargs, **model_kwargs)
  File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 917, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 623, in __getitem__
    raise KeyError(key)
KeyError: 'instructblip'

Application startup failed. Exiting.
(The same traceback repeats on each subsequent restart attempt.)

brianjking changed discussion title from How to Deploy on Inference Endpoints to How to Deploy on Inference Endpoints - handler.py

Hi,

Have you implemented a handler script as explained in the guide above? If so, could you share this script?

@nielsr I submitted a PR, though I have no idea whether it works: https://huggingface.co/Salesforce/instructblip-flan-t5-xl/discussions/5.

Any help? Thanks!

Hi,

There's no need to add handler.py to this repository. You can create a new model repository of your own and add a handler.py script there, loading the model with:

model = InstructBlipForConditionalGeneration.from_pretrained("Salesforce/instructblip-flan-t5-xl")

in the __init__ of the handler class.
