FYI: likely broken 128k - maybe not, maybe only PPL

#1
by bartowski - opened

Llama.cpp does not support longrope. Even with the convert-hf-to-gguf.py change that allows it to fall through and "work", it's not a real implementation, and this model falls apart even at 8k context, with PPL over 500.

https://github.com/ggerganov/llama.cpp/pull/8262#issuecomment-2206065393

edit: slaren seems to think it may actually only be an issue with the PPL tool; this needs more investigation. Closing for now in case I'm wrong.

bartowski changed discussion title from FYI: likely broken 128k to FYI: likely broken 128k - maybe not, maybe only PPL
bartowski changed discussion status to closed

I haven't had the time to test them thoroughly... it might be.
