Wrong version of llama.cpp used for quanting

#1
by gelukuMLG - opened

I just ran the model using the latest version of koboldcpp and it says the model needs requanting because it does not use the BPE tokenizer fix.

Backyard AI org

We are using the correct version of llama.cpp, unless something went horribly wrong. I've used this model in Faraday and it appears to be working correctly, i.e., post-BPE fix. It might be something odd in how koboldcpp checks for that issue. It's also worth noting that Command-R shouldn't even be affected by the BPE issue, as it's not a Llama 3 model.

https://github.com/ggerganov/llama.cpp/pull/7063

Backyard AI org

Unfortunately that PR did not exist at the time I did the quant. I will redo the quant.
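For anyone checking a local file: the linked PR records the pre-tokenizer type in the GGUF metadata under the key tokenizer.ggml.pre, so you can tell whether a quant was made with a post-fix llama.cpp without re-downloading. Below is a minimal sketch using the gguf Python package; the way the string value is decoded from GGUFReader fields is an assumption about that library's internals, so treat it as illustrative.

```python
# Sketch: check whether a GGUF quant carries the pre-tokenizer metadata
# introduced by llama.cpp PR #7063. Requires the `gguf` package (pip install gguf).
import sys

from gguf import GGUFReader


def pre_tokenizer_type(path: str) -> str | None:
    reader = GGUFReader(path)
    field = reader.fields.get("tokenizer.ggml.pre")
    if field is None:
        # Quants made before PR #7063 have no such key at all.
        return None
    # Assumption: for a string field, data[0] indexes the part holding the raw bytes.
    return bytes(field.parts[field.data[0]]).decode("utf-8")


if __name__ == "__main__":
    pre = pre_tokenizer_type(sys.argv[1])
    if pre is None:
        print("No tokenizer.ggml.pre key found: quant likely predates the BPE fix.")
    else:
        print(f"Pre-tokenizer type: {pre} (quant made with a post-fix llama.cpp).")
```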
