I suck at making this work for Llama

#10
by ABX-AI - opened

I've had some issues making the script work with Llama-3, partly because of the different vocab type, but I managed to do it with direct llama.cpp commands instead: getting an fp32 out, then making an f16 out of that.

However, I can't make the imatrix.dat generation work; it acts as if there's a mismatch somewhere in the config. I'll probably keep trying to work it out, but if you have any tips on how to use this with Llama-3, let me know.
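
For reference, the manual pipeline I've been running looks roughly like this (a minimal sketch only; the model paths and calibration file are placeholders, and I'm driving the llama.cpp tools from Python the same way the script does):

```python
# Rough sketch of the manual llama.cpp pipeline (paths are placeholders).
import subprocess

MODEL_DIR = "models/my-llama-3-model"   # HF model folder (placeholder)
F32 = "my-llama-3-model-f32.gguf"
F16 = "my-llama-3-model-f16.gguf"

# 1) HF -> fp32 GGUF. Llama-3's tokenizer needs --vocab-type bpe here,
#    otherwise the conversion fails outright.
subprocess.run(
    ["python", "llama.cpp/convert.py", MODEL_DIR,
     "--outtype", "f32", "--outfile", F32, "--vocab-type", "bpe"],
    check=True,
)

# 2) fp32 -> f16 with the quantize tool.
subprocess.run(["llama.cpp/quantize", F32, F16, "f16"], check=True)

# 3) imatrix generation -- this is the step that fails for me.
subprocess.run(
    ["llama.cpp/imatrix", "-m", F16, "-f", "calibration.txt", "-o", "imatrix.dat"],
    check=True,
)
```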

Yeah, so I haven't tried with Llama-3 yet; just now I was still making some changes to the script to accommodate some of your suggestions and other low-hanging fruit.

We'll wait on this issue then (for the script part; I'll see if a new check has to be added once things settle down around Llama-3):
https://github.com/ggerganov/llama.cpp/issues/6690#issuecomment-2065278517

This is super easy to fix lmao: `--vocab-type bpe` @Lewdiculous @ABX-AI
Also made some changes to config.json and the generation_config. https://files.catbox.moe/u35p33.rar

Yeah, I saw that part, but...
The thing is it'd then need to be detected, and I don't want to have to check for this. I might just add a separate alternative for Llama-3 for now, until this is addressed upstream (or isn't).
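
Just to show what I mean, a detection check would be something like this rough, untested sketch reading the model's config.json (Llama-3 ships a 128256-token BPE vocab versus Llama-2's 32000-token SentencePiece one):

```python
# Untested sketch of a possible Llama-3 check via the model's config.json.
import json
from pathlib import Path

def looks_like_llama_3(model_dir: str) -> bool:
    config = json.loads((Path(model_dir) / "config.json").read_text())
    # Llama-3 uses a 128256-token BPE vocab; Llama-2 uses a
    # 32000-token SentencePiece vocab.
    return config.get("model_type") == "llama" and config.get("vocab_size") == 128256

# Pick the vocab type to pass to convert.py based on the check.
vocab_type = "bpe" if looks_like_llama_3("models/my-model") else "spm"
```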

Thanks for the files.

@ABX-AI Use his provided config files.

I want to be like you when I grow up.


@Nitral-AI Can I host your files in the repo as a fallback?

@Lewdiculous Absolutely my dude!

Thanks, mates.

FantasiaFoundry changed discussion status to closed

Don't know if you have these.

Instruct and context presets for SillyTavern (ST) here: https://files.catbox.moe/lkclc9.rar

FantasiaFoundry changed discussion status to open

Thanks. Still got some things to iron out.

No problem. I decided to be a little more open out of the gate with my findings to help get some nice Llama-3 finetunes out as quickly as possible. :) Now I'm back to working on giving you guys another unhinged banger. :)

Looks okay now... Surely...

For Llama-3 models, at the moment, you have to use gguf-imat-llama-3.py and replace the config files with the ones in the llama-3-config-files folder to properly quant the model and generate the imatrix data.
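
In other words, something along these lines (a sketch only; the model folder is a placeholder, and the copy step just overwrites the model's configs with the ones shipped in llama-3-config-files):

```python
# Sketch of the Llama-3 workflow (model folder is a placeholder).
import shutil
import subprocess
from pathlib import Path

MODEL_DIR = Path("models/my-llama-3-model")  # placeholder
CONFIGS = Path("llama-3-config-files")       # folder shipped with this repo

# Replace the model's config files with the Llama-3 ones before quanting.
for cfg in CONFIGS.glob("*.json"):
    shutil.copy(cfg, MODEL_DIR / cfg.name)

# Then run the Llama-3 variant of the script.
subprocess.run(["python", "gguf-imat-llama-3.py"], check=True)
```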

FantasiaFoundry changed discussion status to closed

Thanks :) I did already use the vocab-type option @Nitral-AI, otherwise it would have been impossible to get the f32 and f16 out at all. I did get those, but the imatrix generation still didn't work, and that was my issue. Thanks for the updates, I'll try the new script out <3
