config.json: max_position_embeddings vs. model_max_length vs. total context

#24
by FlareRebellion - opened

Hi,

The model card says "Context length: 128K", but in config.json we have:

```json
"max_position_embeddings": 8192,
"model_max_length": 131072,
```

What's the difference between these parameters? What do they mean with regard to the maximum context?

It looks like "max_position_embeddings" limits the input length. When I give the model a long input prompt, I get this error:

'message': "This model's maximum context length is 8192 tokens..."

Does anyone else see this error? Is there a way to update the config locally?
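One thing worth trying, assuming the serving stack derives the 8192 limit from the local config.json: patch `max_position_embeddings` to match `model_max_length` before loading the model. A minimal, self-contained sketch (the values come from the question above; the write-back path is an assumption):

```python
import json

# In practice you would load the model's local config.json; here the
# dict just mirrors the two fields quoted in the question.
cfg = {
    "max_position_embeddings": 8192,
    "model_max_length": 131072,
}

# Raise the positional limit to the advertised 128K context window.
# Note: whether outputs stay coherent past 8192 tokens depends on how
# the model was trained (e.g. RoPE scaling), not just on this field.
cfg["max_position_embeddings"] = cfg["model_max_length"]

# Then write the patched dict back, e.g.:
#   Path("config.json").write_text(json.dumps(cfg, indent=2))
print(cfg["max_position_embeddings"])
```

No guarantees on output quality past 8192 tokens, though; if the model wasn't trained (or RoPE-scaled) for long contexts, raising the limit only removes the error, not the degradation.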

I'm using the GGUF of the model without "plus" and haven't seen that error, but it actually produces garbage for inputs longer than 8192 tokens.
