About Input validation error: `inputs` tokens + `max_new_tokens` must be <= 1512.

#47 opened by Holynull

I keep getting the following error from an AWS SageMaker inference endpoint. When I use LangChain to create a QA chain, its prompt template generally has more than 1,000 tokens. How can I solve this issue?

{"error":"Input validation error: `inputs` tokens + `max_new_tokens` must be <= 1512. Given: 1682 `inputs` tokens and 200 `max_new_tokens`","error_type":"validation"}

I also encountered the same problem. Did you find a solution?

Having the same issue with meta-llama/Llama-2-13b-chat-hf. Where is this 1512 limitation coming from?

Support explained: you can configure these limits when creating your endpoint. In the Inference Endpoints UI, the advanced configuration section has two fields, "Max Input Length" and "Max Number of Tokens"; the latter is where the 1512 limit comes from. Change it to suit your needs. I find it tricky to figure out which values work for the model and the platform; so far it's been trial and error.
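That answer covers the Hugging Face Inference Endpoints UI. For the SageMaker deployment in the original question, the same limits are set through environment variables on the Hugging Face LLM (TGI) container, which map to TGI's `--max-input-length` and `--max-total-tokens` server arguments. A rough sketch; the instance type, GPU count, and token limits here are assumptions you would tune for your model and hardware:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes this runs inside SageMaker

model = HuggingFaceModel(
    role=role,
    image_uri=get_huggingface_llm_image_uri("huggingface"),
    env={
        "HF_MODEL_ID": "meta-llama/Llama-2-13b-chat-hf",
        "SM_NUM_GPUS": "4",          # shard across the instance's GPUs
        "MAX_INPUT_LENGTH": "3072",  # max tokens allowed in the prompt
        "MAX_TOTAL_TOKENS": "4096",  # prompt tokens + generated tokens
    },
)
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",
)
```

Whatever values you pick, `MAX_TOTAL_TOKENS` must stay within the model's context window (4096 for Llama 2), and larger limits need more GPU memory, which is why finding workable values often takes some trial and error.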
