Increasing the context window

#4
by maddiemii - opened

I'm loving this model; it's the best I've worked with so far. One thing I don't yet understand is how to increase the context window so that it remembers the conversation for longer. I'd love a context window of 4096 tokens or higher, but I don't understand where the limitation is. Is it in the base model itself and the way it's trained, or something else? Is it not a setting I can change? Thank you!

Great, glad to hear it.

I'm afraid the context window is baked into the model and cannot be increased. This applies to nearly all models available at the moment, including the major ones. For example, GPT-3.5 has a limit of 4096 tokens, and GPT-4 comes in two versions, one with an 8k and one with a 32k context (not many people have access to the 32k version yet, though).
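If you want to see where that number lives for an open model, it's usually right in the model's config. A quick sketch with the `transformers` library (the repo id below is just an example; 2048 is the original LLaMA value):

```python
# Check a model's trained context length from its Hugging Face config.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("huggyllama/llama-7b")  # example repo id
print(config.max_position_embeddings)  # 2048 for the original LLaMA
```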

There are some new models coming out that have much longer context lengths, or methods to increase the context length. MPT, for example, can be extended up to 65k (though I believe it then has massive VRAM requirements); see the sketch below. But generally speaking, existing models have a pre-defined context length which can't be increased. LLaMA was released with a 2k context limit, and all models based on it therefore inherit that.
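MPT can do this because it uses ALiBi position biases rather than learned position embeddings, so its context length is a config value rather than a trained weight. A minimal sketch of raising it at load time, following the pattern from the mosaicml/mpt-7b-storywriter model card (you'll need the VRAM to match):

```python
# MPT's context length is configurable at load time thanks to ALiBi.
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained(
    "mosaicml/mpt-7b-storywriter", trust_remote_code=True
)
config.max_seq_len = 65536  # raise from the training default; very VRAM-hungry
model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b-storywriter", config=config, trust_remote_code=True
)
```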

For existing models there are some techniques that can sometimes help. For example, LangChain has a summarisation feature: in a chat where you're asking follow-up questions, it can automatically summarise past interactions to get the most out of your limited context window.
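If it helps, here's a minimal sketch of that pattern using LangChain's `ConversationSummaryBufferMemory` (API names from the classic `langchain` package; the OpenAI wrapper below is just a stand-in for whichever LLM you're actually running):

```python
from langchain.chains import ConversationChain
from langchain.llms import OpenAI  # stand-in; any LLM wrapper works here
from langchain.memory import ConversationSummaryBufferMemory

llm = OpenAI(temperature=0)

# Keeps recent turns verbatim and summarises older ones once the
# history exceeds max_token_limit, so the prompt stays inside the
# model's fixed context window.
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=1000)

chain = ConversationChain(llm=llm, memory=memory)
chain.predict(input="Hi! Let's talk about context windows.")
chain.predict(input="Remind me what we were just discussing?")
```

The trade-off is that the summary is lossy: anything the summariser drops is gone for good, so this buys you longer conversations rather than a genuinely longer context.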

But other than that, there's not much you can do right now, I believe.
