FYI: likely broken 128k - maybe not, maybe only PPL
#1 by bartowski - opened
Llama.cpp does not support LongRoPE. Even with the convert-hf-to-gguf.py change that lets conversion fall through and appear to "work", it is not a real implementation, and this model falls apart even at 8k context, with PPL over 500.
https://github.com/ggerganov/llama.cpp/pull/8262#issuecomment-2206065393
edit: slaren seems to think it may actually only be an issue with the PPL tool; this needs more investigation. Closing for now, just in case I'm wrong.
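For reference, a sketch of how a perplexity check like the one above is typically run with llama.cpp's perplexity tool (the model path and dataset file here are placeholders, not taken from this thread; exact binary names have changed across llama.cpp versions):

```shell
# Convert the HF checkpoint to GGUF (requires the fall-through patch
# mentioned above for LongRoPE models to convert at all).
python convert-hf-to-gguf.py ./path-to-hf-model --outfile model.gguf

# Measure perplexity at 8k context on a raw-text evaluation file.
# -c sets the context length, -f the evaluation text.
./llama-perplexity -m model.gguf -f wiki.test.raw -c 8192
```

A healthy model usually lands in the single digits on WikiText-2-style data, so a PPL over 500 indicates either a broken model or, per slaren's suggestion, a problem in the measurement tool itself.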
bartowski changed discussion title from "FYI: likely broken 128k" to "FYI: likely broken 128k - maybe not, maybe only PPL"
bartowski changed discussion status to closed
Haven't had the time to test them thoroughly... it might be only a PPL tool issue.