Gibberish results when context is greater than 2048

#4
by Bakanayatsu - opened

Using koboldcpp 1.63, and as the title says, I get gibberish results when the context is greater than 2048. This also happens with the 4k version.

Pruna AI org

Are you exclusively getting this in koboldcpp, or also when you use llama.cpp itself?
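
For reference, a minimal repro sketch outside koboldcpp, using the llama-cpp-python bindings (`pip install llama-cpp-python`); the model path below is hypothetical:

```python
# Sketch: feed a prompt longer than 2048 tokens and see whether the
# completion degrades into gibberish. If it does here too, the issue is
# on the GGUF / llama.cpp side rather than in koboldcpp.
from llama_cpp import Llama

llm = Llama(model_path="phi-3-mini-128k.gguf", n_ctx=4096)  # hypothetical path

long_prompt = "The quick brown fox jumps over the lazy dog. " * 300  # well past 2048 tokens
out = llm(long_prompt + "\nSummarize the text above in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```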

The latest llama.cpp version supports Phi 3 now, so this should be closed.

Bakanayatsu changed discussion status to closed
Pruna AI org

Great!

This was converted before Phi 3 support was merged in llama.cpp, so I reopened it.
Since the script convert-hf-to-gguf.py was also changed to support the new Phi 3 architecture, the GGUF here is outdated; it might be using wrong tokens or be missing some, so it might be worth redoing the GGUF in this repo.
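
For anyone redoing it locally, a hedged sketch of re-running the updated converter from the llama.cpp repo (the checkpoint directory and output name below are hypothetical):

```python
# Sketch: invoke llama.cpp's updated convert-hf-to-gguf.py on the original
# HF checkpoint so the resulting GGUF carries the phi3 architecture and the
# current tokenizer metadata. Paths are placeholders.
import subprocess

subprocess.run(
    [
        "python", "convert-hf-to-gguf.py",        # script at the root of the llama.cpp repo
        "Phi-3-mini-128k-instruct",               # local HF checkpoint directory (hypothetical)
        "--outfile", "phi-3-mini-128k-f16.gguf",  # hypothetical output name
        "--outtype", "f16",
    ],
    check=True,
)
```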

Bakanayatsu changed discussion status to open

A new Phi 3 mini 128k GGUF from a day ago shows:

general.architecture: phi3
general.name: Phi3

This one has:

general.architecture: llama
general.name: phi3
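
To check those fields yourself, a minimal sketch assuming the gguf Python package that ships with llama.cpp (`pip install gguf`); the filename is hypothetical:

```python
from gguf import GGUFReader

reader = GGUFReader("phi-3-mini-4k.gguf")  # hypothetical filename

for name in ("general.architecture", "general.name"):
    field = reader.fields.get(name)
    if field is not None:
        # String metadata arrives as raw bytes; field.data indexes the part
        # of the field that holds the actual value.
        value = bytes(field.parts[field.data[0]]).decode("utf-8")
        print(f"{name}: {value}")
```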
Pruna AI org

Was this converted before Phi 3 support was merged in llama.cpp?
Since the script convert-hf-to-gguf.py was also changed to support it, the GGUF might be using wrong tokens, or some tokens might be missing.

We converted it with a custom PR on llama.cpp, but we are in the process of updating the files with the latest main version. I will ping here when the update is done :)

https://github.com/ggerganov/llama.cpp/issues/6849#issuecomment-2074899603
It seems that the 128K-context Phi-3 is still being worked on, so it might be better to focus on the Phi-3 4K instead.

Pruna AI org

Ah okay thanks for the heads up, will do! :)

Pruna AI org

@Bakanayatsu 4k quants are now updated here
