How to use 128k context?
How do I use the 128k context with llama.cpp?
I would also like to know, since the example in the readme only sets the context length to 4096.
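For reference, a minimal sketch of raising the context length with llama.cpp's `main` example (the model filename below is a placeholder; `-c`/`--ctx-size` is the standard context-size flag):

```sh
# Sketch: run llama.cpp's main example with a larger context window.
# The model path is a placeholder -- substitute your actual file.
# -c / --ctx-size sets the context length in tokens (the readme example uses 4096).
./main -m ./models/model-128k.gguf -c 16384 -p "Your prompt here"
```

Note that just raising `-c` past the context the model was trained for won't give good results unless the matching RoPE scaling is applied too.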
llama.cpp currently does not support the YaRN RoPE scaler.
So how exactly can we use this model's 128k context?
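Until YaRN support lands in llama.cpp, the closest workaround seems to be linear RoPE scaling via the existing `--rope-freq-scale` flag. This is not equivalent to YaRN, so output quality will likely suffer; a sketch, assuming a 4096-token base context extended to 16384:

```sh
# Workaround sketch (NOT real YaRN): linear RoPE scaling.
# --rope-freq-scale = trained context / target context, e.g. 4096 / 16384 = 0.25.
# Expect degraded output compared to proper YaRN support.
./main -m ./models/model-128k.gguf -c 16384 --rope-freq-scale 0.25 -p "Your prompt here"
```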
He should probably put a note at the top of the readme saying that there is currently no way to use this.
*can't use it at its current context limit. I got it working with a smaller context (which defeats the point), but the responses were really bad. Then again, I'm a noob, so I could have been prompting the model incorrectly. Still, props for this model being one of the first open-source LLMs out of the gate with an over-100K context window (at least potentially), and I look forward to seeing more refinements here.
Excerpts from their paper:
Looks like, for coding tasks, the Code Llama models already perform very well at long contexts as-is.