Run this model with llama.cpp, get gibberish output

#12
by xiaojinchuan

I converted this model to ggml and quantized it to 4-bit using https://github.com/ggerganov/llama.cpp/blob/master/convert.py,
then ran the quantized model with llama-cpp-python, but the output is gibberish, as shown below.

[Screenshot: gibberish output from the quantized model]
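For reference, a minimal sketch of how the quantized model was presumably loaded and run with llama-cpp-python; the model filename and prompt below are assumptions, not taken from the original post:

```python
# Minimal reproduction sketch using llama-cpp-python.
# The model path and prompt are hypothetical; substitute your own.
from llama_cpp import Llama

# Load the 4-bit quantized ggml model (assumed filename).
llm = Llama(model_path="./ggml-model-q4_0.bin")

# Run a simple completion; with this issue, the returned text is gibberish.
output = llm("Q: What is the capital of France? A:", max_tokens=32)
print(output["choices"][0]["text"])
```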
