May require reconversion due to llama.cpp enhancements

#1 opened by concedo

The convert-hf-to-gguf.py script was recently updated to support Llama 3 pretokenization, fixing some incorrect regex merges. I believe this may require reconversion and requantization of all Llama 3 models.

https://github.com/ggerganov/llama.cpp/pull/6920
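For reference, that PR adds a `tokenizer.ggml.pre` metadata key to converted GGUF files, so one rough way to tell whether a file came from the updated converter is to check for that key. A minimal sketch using the `gguf` Python package (`pip install gguf`); the file path is a placeholder and the field-decoding details may vary between `gguf` versions:

```python
# Sketch: check whether a GGUF file carries the new pretokenizer metadata.
# The model path below is a placeholder, not an actual file from this repo.
from gguf import GGUFReader

reader = GGUFReader("llama-3-model.Q4_K_M.gguf")
field = reader.fields.get("tokenizer.ggml.pre")

if field is None:
    print("No tokenizer.ggml.pre key -- likely converted before the fix; reconvert.")
else:
    # For string fields, `data` indexes into `parts`; decode the raw bytes.
    value = bytes(field.parts[field.data[0]]).decode("utf-8")
    print(f"Pretokenizer: {value}")  # expected to be 'llama-bpe' for Llama 3
```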

Thank you for bringing this to my attention. I will release an updated and improved version soon with this included!

Thanks for the ping, holding off until it's acknowledged by slaren or ggerganov, just to be sure the fix isn't yet another hack that'll need to be fixed lol

@bartowski Yeah lol, I'm doing more tests now with different regexes and will continue my thread on llama.cpp with the findings.

Hi, is there any news about the v2 version?

@huggingfacess There's a huge thread about GGUF and llama.cpp issues linked here. I will see when I can get things in order for a new version; for now I will have to verify what isn't working as intended.

It seems this isn't actually a bug in the conversions but rather in the inference tools.

Some tools prepend an additional BOS token, which messes with generation.

https://github.com/ggerganov/llama.cpp/issues/7062#issuecomment-2100027534
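To illustrate the failure mode (a hedged sketch, not code from the thread): with the Hugging Face tokenizer, `apply_chat_template` already prepends `<|begin_of_text|>`, so tokenizing its output with `add_special_tokens=True` yields a second BOS, which is the kind of duplication described in the linked issue. The model ID below assumes access to the gated Meta repo.

```python
# Sketch: demonstrate the double-BOS pitfall with a Llama 3 chat template.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
messages = [{"role": "user", "content": "Hello"}]

# apply_chat_template already prepends <|begin_of_text|> (the BOS token).
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenizing with add_special_tokens=True prepends a *second* BOS.
double_bos = tok(prompt, add_special_tokens=True).input_ids
single_bos = tok(prompt, add_special_tokens=False).input_ids
print(double_bos[:3])  # BOS appears twice at the start
print(single_bos[:3])  # BOS appears once, as intended
```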

@bartowski Yes, the issue seems to be what I found: instruct or fine-tuned models should be run with the system tokens present, just as they were during fine-tuning, regardless of whether the system message string is empty or not. Removing them results in unexpected outputs.
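In other words (a sketch assuming a Llama-3-style chat template; the model ID is the gated Meta repo), passing an explicitly empty system message keeps the system header tokens in the prompt, whereas omitting the system role drops them entirely:

```python
# Sketch: an empty system message still emits the system header tokens,
# while omitting the system role removes that block from the prompt.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

with_empty_system = [
    {"role": "system", "content": ""},
    {"role": "user", "content": "Hello"},
]
without_system = [
    {"role": "user", "content": "Hello"},
]

print(tok.apply_chat_template(with_empty_system, tokenize=False))
# ...<|start_header_id|>system<|end_header_id|>\n\n<|eot_id|>... is present
print(tok.apply_chat_template(without_system, tokenize=False))
# system block absent -- a prompt format the fine-tune may never have seen
```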

Orenguteng changed discussion status to closed
