https://github.com/ggerganov/llama.cpp/pull/6920

#26

For those converting Llama-3 (BPE) models, you'll have to read llama.cpp#6920 for more context.

Make sure you're on the latest llama.cpp repo commit, then run the new convert-hf-to-gguf-update.py script inside the repo. Afterwards, you need to manually copy the config files from llama.cpp/models/tokenizers/llama-bpe into your downloaded model folder, replacing the existing ones.

Try again and the conversion process should work as expected.
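
If it helps, the whole sequence looks roughly like this in a shell. This is a rough sketch, not the exact commands from the PR: the model path, output names, and `<hf_read_token>` are placeholders, and I'm copying just the two tokenizer files rather than everything in that folder.

```bash
# Get on the latest commit and fetch the reference tokenizers
# (the update script downloads them from HF, so it needs a read token).
cd llama.cpp
git pull
python3 convert-hf-to-gguf-update.py <hf_read_token>

# Replace the tokenizer configs in your downloaded model folder
# with the ones the script placed under models/tokenizers/llama-bpe.
cp models/tokenizers/llama-bpe/tokenizer.json ~/models/Meta-Llama-3-8B/
cp models/tokenizers/llama-bpe/tokenizer_config.json ~/models/Meta-Llama-3-8B/

# Re-run the conversion as usual.
python3 convert-hf-to-gguf.py ~/models/Meta-Llama-3-8B \
    --outfile llama-3-8b-f16.gguf --outtype f16
```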

FantasiaFoundry changed pull request status to merged

I can't find any information on the issue page. Do imatrix files need to be regenerated using this PR, or do old imatrix files still work?

Honestly, that's kind of why I want to redo the quants. It's not mentioned and I'm not sure; if you could do us a favor and ask there, that'd be good to know. Just to be safe I'll redo them from scratch, it doesn't take that long.

It has been asked :3
I'll let you know what the answer is if they respond before I fall asleep 😸

I come bearing the worst news possible :3

You will need to regenerate importance matrices since they depend on how the input text was tokenized.

here
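
So for the "redo from scratch" part, the flow is basically: regenerate the imatrix from the newly converted f16 GGUF, then quantize against it. A minimal sketch, assuming binaries built from the current repo; the file names and calibration text are placeholders.

```bash
# Regenerate the importance matrix against the fixed tokenization
# (-ngl 99 offloads all layers to the GPU, which speeds this up a lot).
./imatrix -m llama-3-8b-f16.gguf -f calibration.txt \
    -o llama-3-8b.imatrix -ngl 99

# Re-quantize with the fresh imatrix; don't reuse the old .imatrix files.
./quantize --imatrix llama-3-8b.imatrix \
    llama-3-8b-f16.gguf llama-3-8b-Q4_K_M.gguf Q4_K_M
```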

I imagine it would've been a nightmare pre-GPU-acceleration for imatrix 😭

Fun... now I'll have to find time to redo the last model. Although... better this than stagnation.
