200k -> 4k

#2
by ssaroya - opened

Hi, quick question: I was wondering whether the 4k context restriction is something that can be easily removed?

What do you mean? This model supports up to 200K context length in principle. You can hardly fit the full 200K context, though, as it would cost around 40GB of VRAM or so, I believe. So with most model loaders like vLLM, you need to set `max_model_len` to something like 8192 for this model, which is large enough for most tasks.
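
For reference, this is where the cap is set when loading with vLLM's Python API. A minimal sketch, assuming vLLM is installed; the model id below is just a placeholder, not this repo:

```python
from vllm import LLM, SamplingParams

# Load the model with the context window capped at 8192 tokens so the
# KV cache fits in VRAM, instead of the full 200K the model supports.
llm = LLM(model="your-org/your-200k-model", max_model_len=8192)

sampling = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Summarize this document: ..."], sampling)
print(outputs[0].outputs[0].text)
```

The equivalent server flag is `--max-model-len 8192` when launching the vLLM API server.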

(screenshot: image.png)
So this here

I believe this is just a typo; it should be 200K. @TheBloke