Take a look: https://github.com/ggerganov/llama.cpp/pull/1508
This model has been updated for the new GGML format.
Great, are you planning to keep it updated going forward? Also, may I ask how much RAM is needed to run the quantization?