Gibberish results when context is greater than 2048

#4
by Bakanayatsu - opened

Using koboldcpp 1.63, and as the title says, I get gibberish results when the context is greater than 2048. This also happens with the 4k version.

Pruna AI org

Are you exclusively getting this in koboldcpp, or also when you use llama.cpp itself?
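
For reference, a minimal repro sketch outside koboldcpp, using the llama-cpp-python bindings (`pip install llama-cpp-python`); the model path below is hypothetical:

```python
# Sketch: feed a prompt longer than 2048 tokens and see whether the
# completion degrades into gibberish. If it does here too, the issue is
# on the GGUF / llama.cpp side rather than in koboldcpp.
from llama_cpp import Llama

llm = Llama(model_path="phi-3-mini-128k.gguf", n_ctx=4096)  # hypothetical path

long_prompt = "The quick brown fox jumps over the lazy dog. " * 300  # well past 2048 tokens
out = llm(long_prompt + "\nSummarize the text above in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```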

The latest llama.cpp version supports Phi 3 now, so this should be closed.

Bakanayatsu changed discussion status to closed
Pruna AI org

Great!

This was converted before Phi 3 support was merged in llama.cpp, so I reopened it.
Since the script convert-hf-to-gguf.py was also changed to support the new Phi 3 architecture, the GGUF here is outdated; it might be using wrong tokens or be missing some, so it might be worth redoing the GGUF in this repo.
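
For anyone redoing it locally, a hedged sketch of re-running the updated converter from the llama.cpp repo (the checkpoint directory and output name below are hypothetical):

```python
# Sketch: invoke llama.cpp's updated convert-hf-to-gguf.py on the original
# HF checkpoint so the resulting GGUF carries the phi3 architecture and the
# current tokenizer metadata. Paths are placeholders.
import subprocess

subprocess.run(
    [
        "python", "convert-hf-to-gguf.py",        # script at the root of the llama.cpp repo
        "Phi-3-mini-128k-instruct",               # local HF checkpoint directory (hypothetical)
        "--outfile", "phi-3-mini-128k-f16.gguf",  # hypothetical output name
        "--outtype", "f16",
    ],
    check=True,
)
```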

Bakanayatsu changed discussion status to open

A new Phi 3 mini 128k GGUF from a day ago shows:

general.architecture: phi3
general.name: Phi3

This one has:

general.architecture: llama
general.name: phi3
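
To check those fields yourself, a minimal sketch assuming the gguf Python package that ships with llama.cpp (`pip install gguf`); the filename is hypothetical:

```python
from gguf import GGUFReader

reader = GGUFReader("phi-3-mini-4k.gguf")  # hypothetical filename

for name in ("general.architecture", "general.name"):
    field = reader.fields.get(name)
    if field is not None:
        # String metadata arrives as raw bytes; field.data indexes the part
        # of the field that holds the actual value.
        value = bytes(field.parts[field.data[0]]).decode("utf-8")
        print(f"{name}: {value}")
```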
Pruna AI org

Was this converted before Phi 3 support was merged in llama.cpp?
Since the script convert-hf-to-gguf.py was also changed to support it, the GGUF might be using wrong tokens, or some tokens might be missing.

We converted it with a custom PR on llama.cpp, but we are in the process of updating the files with the latest main version. I will ping here when the update is done :)

https://github.com/ggerganov/llama.cpp/issues/6849#issuecomment-2074899603
It seems that the 128K-context Phi-3 is still being worked on, so it might be better to focus on the Phi-3 4K instead.

Pruna AI org

Ah okay thanks for the heads up, will do! :)

Pruna AI org

@Bakanayatsu 4k quants are now updated here
