Can't find a way to make it work with llama.cpp

by ZeroWw - opened

I'm trying to use gemma-7b with llama.cpp.
I converted the model to GGUF.
When I start the server and try to chat, the model answers correctly the first time (though very briefly), then starts talking to itself :(
Any ideas?
