Does this model use the same format as openchat 3.5?

by tarruda - opened

I downloaded the Q6_K GGUF version and ran it on the llama.cpp Python API server (using the same prompt format as openchat 3.5). It seems to behave strangely at the end of a response:
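For reference, the OpenChat 3.5 prompt format renders turns as "GPT4 Correct User" / "GPT4 Correct Assistant", each terminated by the `<|end_of_turn|>` special token. A minimal sketch of building such a prompt (the helper function name is mine, not part of any library):

```python
def build_openchat_prompt(messages):
    """Assemble an OpenChat 3.5-style prompt string.

    Each turn is rendered as "<role name>: <content><|end_of_turn|>",
    and the prompt ends with the assistant header so the model
    continues from there.
    """
    role_names = {"user": "GPT4 Correct User", "assistant": "GPT4 Correct Assistant"}
    parts = []
    for msg in messages:
        parts.append(f"{role_names[msg['role']]}: {msg['content']}<|end_of_turn|>")
    parts.append("GPT4 Correct Assistant:")
    return "".join(parts)

prompt = build_openchat_prompt(
    [{"role": "user", "content": "Who is faster, Jane or Rahul?"}]
)
```

If the new model expects a different template (or a different EOS token), feeding it this format could explain odd endings.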


Seeing the same thing with llama.cpp (not python) and the same GGUF:
Therefore, Jane is faster than Rahul.abbabbababbabbbababbabbbababbabbababbabababbabababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbababbab

I'm using KoboldCpp and I have a similar issue

Screenshot 2023-12-14 at 20-25-59 KoboldAI Lite.png

Screenshot 2023-12-14 at 20-30-00 KoboldAI Lite.png

Screenshot 2023-12-14 at 20-45-02 KoboldAI Lite.png

@imone You might wanna check this

OpenChat org

Just changed the EOS token, should be good now!
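The runaway repetition reported above is consistent with that: generation stops when the sampled token matches the EOS id recorded in the model's metadata, so if the recorded id doesn't match the token the model actually emits, the stop check never fires. A toy illustration (token ids and the sampler are invented; llama.cpp does the real check in C++):

```python
def generate(sample_next, eos_id, max_tokens=16):
    """Sample tokens until the recorded EOS id appears or the budget runs out."""
    out = []
    for _ in range(max_tokens):
        tok = sample_next()
        if tok == eos_id:  # stop check compares against the metadata EOS id
            break
        out.append(tok)
    return out

# Suppose the model really ends its turn with token id 2:
emitted = [5, 7, 2] + [9] * 13

# Correct metadata: generation stops at the EOS token.
ok = generate(iter(emitted).__next__, eos_id=2)

# Wrong metadata (EOS recorded as 42): the stop check never
# matches, and generation runs to the token budget.
runaway = generate(iter(emitted).__next__, eos_id=42)
```

That is why fixing the EOS token in the GGUF metadata (rather than the weights) is enough to resolve it.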

alpayariyak changed discussion status to closed

Issue is fixed on latest GGUF upload, thanks.
