https://github.com/ggerganov/llama.cpp/pull/6920
For those converting Llama-3 BPE models, you'll have to read llama.cpp/#6920 for more context.
Make sure you're on the latest llama.cpp commit, then run the new convert-hf-to-gguf-update.py
script inside the repo. Afterwards, manually copy the config files from llama.cpp/models/tokenizers/llama-bpe
into your downloaded model folder, replacing the existing ones.
Try again and the conversion process should work as expected.
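The steps above, sketched as shell commands. The model directory is a placeholder, not from the thread, and details (e.g. whether the update script needs a Hugging Face token argument) may vary by llama.cpp version:

```shell
# Sketch of the steps above; adjust paths to your setup.
cd llama.cpp
git pull                                # make sure you're on the latest commit
python convert-hf-to-gguf-update.py     # regenerates tokenizer configs (may require an HF token argument)

# Overwrite the model's tokenizer config files with the regenerated ones
MODEL_DIR=../Meta-Llama-3-8B            # placeholder path for your downloaded model folder
cp models/tokenizers/llama-bpe/* "$MODEL_DIR"/

# Retry the conversion
python convert-hf-to-gguf.py "$MODEL_DIR"
```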
I can't find any information on the issue page: do imatrices need to be regenerated using this PR, or do old imatrix files still work?
Honestly, that's kind of why I want to redo the quants: it's not mentioned and I'm not sure. If you can do us a favor and ask there, that'd be good to know. Just to be safe I'll redo them from scratch; it doesn't take that long.
It has been asked :3
I'll let you know what the answer is if they respond before I fall asleep
I come bearing the worst news possible :3
You will need to regenerate importance matrices since they depend on how the input text was tokenized.
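For reference, regenerating an importance matrix with llama.cpp's imatrix tool looks roughly like this. File names and paths are placeholders, not from the thread, and flags may differ across llama.cpp versions:

```shell
# Sketch: after re-converting the model with the fixed BPE tokenizer,
# regenerate the importance matrix from your calibration text.
# Paths are placeholders; "imatrix" and "quantize" are built from the llama.cpp repo.
./imatrix -m ./models/llama-3-8b-f16.gguf -f calibration.txt -o llama-3-8b.imatrix

# Then feed it to quantization, e.g.:
./quantize --imatrix llama-3-8b.imatrix \
  ./models/llama-3-8b-f16.gguf ./models/llama-3-8b-Q4_K_M.gguf Q4_K_M
```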
I imagine generating imatrices would've been a nightmare before GPU acceleration
Fun... now I'll have to find time to redo the last model. Although... better this than stagnation.