I can only get this to work at 8192 context in Oobabooga. I heard it could do more; is that false?

#12
by Goldenblood56 - opened

I set n_ctx to 8192 and "Truncate the prompt up to this length" to 8192, and it works fine. But if I choose 16,384 instead, once the context size gets above ~9,800 it starts to generate scrambled or blank replies. I heard GGUF can handle up to 32K, but I don't know if Dolphin 2.1 Mistral has a limit of 8K? Thanks.
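For what it's worth, the "truncate the prompt" setting just drops the oldest tokens so that the prompt plus the tokens to be generated still fit inside n_ctx. A minimal sketch of that idea (the function name and the toy numbers are made up for illustration, not taken from Oobabooga's code):

```python
def truncate_prompt(tokens, n_ctx, max_new_tokens):
    """Keep only the most recent tokens so the prompt plus the
    reply to be generated still fits inside the context window."""
    budget = n_ctx - max_new_tokens  # room left for the prompt itself
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than n_ctx")
    return tokens[-budget:]  # drop the oldest tokens from the front

# toy example: a 10-token prompt, an 8-token window, 3 tokens reserved for the reply
prompt = list(range(10))
print(truncate_prompt(prompt, n_ctx=8, max_new_tokens=3))  # keeps the last 5 tokens
```

So with truncation set to 8192 the front of long chats gets silently cut, which is why it "works fine" there but falls apart once you raise n_ctx past what the model was trained for.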

Cognitive Computations org

I couldn't get it to work at greater than 8K either.

Yes, I think I read it's only 8K, which is fine. Other models can go up to 32K, but I found out my PC is more or less limited to 8K anyway with a 7B model.
