Use in SageMaker

#13 opened by joehoyle

When deploying the model to SageMaker via the Hugging Face-provided script, it's not clear how to call the inference API through the SageMaker Invoke API. Would it be possible to provide an example using `aws sagemaker-runtime invoke-endpoint --cli-binary-format raw-in-base64-out --endpoint-name` showing what data should be posted, for example for T2TT?
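In case it helps anyone landing here, below is a minimal sketch of what the request body might look like for T2TT, assuming the endpoint runs the standard Hugging Face inference toolkit handler with the transformers translation pipeline. The payload schema, the `src_lang`/`tgt_lang` language codes, and the endpoint name are assumptions, not confirmed by AWS or the model card:

```python
import json

# Hypothetical T2TT request body. The `inputs`/`parameters` shape is the
# convention used by the Hugging Face inference toolkit; the exact parameter
# names accepted for SeamlessM4T v2 are an assumption based on the
# transformers translation pipeline.
payload = {
    "inputs": "Hello, my dog is cute",
    "parameters": {"src_lang": "eng", "tgt_lang": "fra"},
}
body = json.dumps(payload)
print(body)

# To actually call the endpoint (uncomment, with a real endpoint name):
# import boto3
# client = boto3.client("sagemaker-runtime")
# response = client.invoke_endpoint(
#     EndpointName="my-seamless-m4t-endpoint",  # hypothetical name
#     ContentType="application/json",
#     Body=body,
# )
# print(response["Body"].read().decode())
```

The same JSON string is what you would pass as the request body with `aws sagemaker-runtime invoke-endpoint` on the CLI.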

Hmm, it seems this relies on huggingface transformers, and seamless-m4t-v2-large isn't in a released version of transformers yet. Presumably that means it wouldn't work when deployed via SageMaker. It seems support for the upcoming transformers release may also need to be added to https://github.com/aws/sagemaker-huggingface-inference-toolkit/ before this model can be deployed to SageMaker?

The latest release of transformers (4.36.2) contains the model - let us know how you get on!

Hi @joehoyle, the transformers version is pinned in the Python package https://github.com/aws/sagemaker-python-sdk, so maybe you can open an issue there.
Since transformers is updated frequently, there is some manual work to make it compatible; for instance, there is an open PR to enable transformers v4.32.0: https://github.com/aws/sagemaker-python-sdk/issues/4075
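For reference, the version pin surfaces when you create the model object: `HuggingFaceModel` in the sagemaker SDK takes a `transformers_version` argument, and only versions with a published Deep Learning Container image will deploy. A sketch, where the S3 path, role ARN, and the availability of a 4.36 DLC image are all assumptions:

```python
# Hypothetical deployment settings pinning a transformers release that
# includes SeamlessM4T v2 (>= 4.36). Whether this version resolves to a
# real SageMaker DLC image depends on what AWS has published.
model_kwargs = {
    "model_data": "s3://my-bucket/model.tar.gz",  # hypothetical artifact
    "role": "arn:aws:iam::123456789012:role/MySageMakerRole",  # hypothetical
    "transformers_version": "4.36",  # must match an available DLC image
    "pytorch_version": "2.1",
    "py_version": "py310",
}

# With the SDK installed and AWS credentials configured:
# from sagemaker.huggingface import HuggingFaceModel
# model = HuggingFaceModel(**model_kwargs)
# predictor = model.deploy(
#     initial_instance_count=1,
#     instance_type="ml.g5.xlarge",  # hypothetical instance choice
# )
```

If the SDK rejects the version, that is the pin being discussed above, and an issue on sagemaker-python-sdk is the right place to raise it.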
