https://github.com/ggerganov/llama.cpp/pull/6920

#26

For those converting Llama-3 (BPE) models, you'll have to read llama.cpp#6920 for more context.

Make sure you're on the latest llama.cpp repo commit, then run the new convert-hf-to-gguf-update.py script inside the repo. Afterwards, you need to manually copy the config files from llama.cpp/models/tokenizers/llama-bpe into your downloaded model folder, replacing the existing ones.

Try again and the conversion process should work as expected.
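
If it helps, the whole sequence looks roughly like this in a shell. This is a rough sketch, not the exact commands from the PR: the model path, output names, and `<hf_read_token>` are placeholders, and I'm copying just the two tokenizer files rather than everything in that folder.

```bash
# Get on the latest commit and fetch the reference tokenizers
# (the update script downloads them from HF, so it needs a read token).
cd llama.cpp
git pull
python3 convert-hf-to-gguf-update.py <hf_read_token>

# Replace the tokenizer configs in your downloaded model folder
# with the ones the script placed under models/tokenizers/llama-bpe.
cp models/tokenizers/llama-bpe/tokenizer.json ~/models/Meta-Llama-3-8B/
cp models/tokenizers/llama-bpe/tokenizer_config.json ~/models/Meta-Llama-3-8B/

# Re-run the conversion as usual.
python3 convert-hf-to-gguf.py ~/models/Meta-Llama-3-8B \
    --outfile llama-3-8b-f16.gguf --outtype f16
```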

FantasiaFoundry changed pull request status to merged

I can't find any information on the issue page. Do imatrix files need to be regenerated using this PR, or do old imatrix files still work?

Honestly, that's kind of why I want to redo the quants. It's not mentioned and I'm not sure; if you could do us a favor and ask there, that'd be good to know. Just to be safe I'll redo them from scratch, it doesn't take that long.

It has been asked :3
I'll let you know what the answer is if they respond before I fall asleep 😸

I come bearing the worst news possible :3

You will need to regenerate importance matrices since they depend on how the input text was tokenized.

here
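
So for the "redo from scratch" part, the flow is basically: regenerate the imatrix from the newly converted f16 GGUF, then quantize against it. A minimal sketch, assuming binaries built from the current repo; the file names and calibration text are placeholders.

```bash
# Regenerate the importance matrix against the fixed tokenization
# (-ngl 99 offloads all layers to the GPU, which speeds this up a lot).
./imatrix -m llama-3-8b-f16.gguf -f calibration.txt \
    -o llama-3-8b.imatrix -ngl 99

# Re-quantize with the fresh imatrix; don't reuse the old .imatrix files.
./quantize --imatrix llama-3-8b.imatrix \
    llama-3-8b-f16.gguf llama-3-8b-Q4_K_M.gguf Q4_K_M
```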

I imagine it would've been a nightmare pre-GPU-acceleration for imatrix 😭

Fun... now I'll have to find time to redo the last model. Although... better this than stagnation.
