How to use 128k context?

#1
by mirek190 - opened

How do I use the 128k context with llama.cpp?

I would also like to know, as the example in the readme only sets the context length to 4096.
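For reference, the readme-style invocation only sets the context window with `-c`. A minimal sketch, assuming the standard `main` example binary from a llama.cpp build of that era (the model path is a placeholder, not the actual file from this repo):

```
# Sketch: readme-style run with a 4096-token context window.
# Raising -c alone does not give a working 128k context unless the
# runtime also supports the model's YaRN rope scaling (see below).
./main -m <path-to-model-file> -c 4096 -n 256 -p "Your prompt here"
```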

llama.cpp currently does not support YaRN RoPE scaling.
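As far as I know, llama.cpp at that point only exposed linear RoPE frequency scaling via `--rope-freq-scale`, not YaRN. A hedged sketch of stretching the window that way (this is not the YaRN scheme the model was fine-tuned with, so long-context quality is not guaranteed):

```
# Workaround sketch, not YaRN: scale RoPE frequencies linearly.
# A scale of 0.25 maps positions up to 4x the original training context
# into the trained range, but since the model was fine-tuned with YaRN,
# output quality at long range may still be poor.
./main -m <path-to-model-file> -c 16384 --rope-freq-scale 0.25 -n 256 -p "Your prompt here"
```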

So how exactly can we use this model's 128k context?

You don't. @TheBloke, there is currently no way to use this model (file) that I am aware of.

He should probably put a note at the top of the readme saying that there is currently no way to use this.

*Can't use it at its current context limit. I got it working with a smaller context (which defeats the point), but the responses were really bad. That said, I'm a noob, so I could have been prompting the model incorrectly. Still, props for this model being one of the first open-source LLMs out of the gate with an over-100K context window (at least potentially), and I look forward to seeing more refinements here.

Excerpts from their paper (images attached).

Looks like for coding tasks, the Code Llama models perform very well at long contexts as-is.
