Dataset for importance matrix?

#3
by RamoreRemora - opened

I can't tell from the model card: what dataset did you use to generate the importance matrix?

There's also a new approach to importance matrix generation that might be worth looking into:
https://github.com/ggerganov/llama.cpp/discussions/5006
A link to the final dataset, "20k_random_data.txt", is further down in that discussion.
What's your opinion?

All models here and in the other repositories were created using an imatrix computed from wiki.train.raw.

Concerning the use of (pseudo-)random data for the imatrix, I wrote a bit about that here.
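
For reference, this is roughly how an imatrix is computed with llama.cpp's imatrix tool. A sketch only: file names are placeholders, and the chunk count and -ngl values are examples to adjust for your hardware.

```sh
# Compute an importance matrix over wiki.train.raw with the default 512-token context.
# Model and output paths are placeholders; tune --chunks and -ngl for your setup.
./imatrix -m mixtral-8x7b-f16.gguf \
          -f wiki.train.raw \
          -o imatrix-wiki.dat \
          -c 512 --chunks 2000 -ngl 99
```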

Alright, thanks for your reply.
I've been experimenting with a few more imatrix datasets with middling results; I can't quite reach the ppl of your models. Did you quantize from the f16 version? I'm re-quantizing from Q8 because my system may not be able to load the f16 model, and I'm wondering if I'm leaving something on the table (all other parameters are default).
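
My re-quantization step looks roughly like this (a sketch; file names are placeholders, and --allow-requantize is needed because the input is already quantized):

```sh
# Re-quantize from the Q8_0 GGUF instead of f16, applying the imatrix.
./quantize --allow-requantize --imatrix imatrix-wiki.dat \
           mixtral-8x7b-q8_0.gguf \
           mixtral-8x7b-iq3_xxs.gguf IQ3_XXS
```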

Update:
I went through the process of downloading the 96 GB original Mixtral, converting it to an f16 GGUF, and quantizing from there. I also re-made the imatrices (though still calculated them from the Q8 Mixtral), bumping up the number of chunks considerably and creating versions with 512 and 4096 context size, still on the same wiki.train.raw dataset.
The improvement was minuscule in all cases, nowhere near enough to bridge the gap. I also tried this imatrix and this one that I found, but they scored even worse than mine.
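
For reference, the conversion / imatrix / quantization steps above, as a sketch (paths, model names, and chunk counts are placeholders for my actual values):

```sh
# Convert the original HF Mixtral checkpoint to an f16 GGUF (convert.py from the llama.cpp repo).
python convert.py ./Mixtral-8x7B-v0.1 --outtype f16 --outfile mixtral-8x7b-f16.gguf

# Re-make the imatrix with a larger context window, still from the Q8_0 model and wiki.train.raw.
./imatrix -m mixtral-8x7b-q8_0.gguf -f wiki.train.raw \
          -o imatrix-wiki-c4096.dat -c 4096 --chunks 1000 -ngl 99

# Quantize the f16 GGUF using the new imatrix.
./quantize --imatrix imatrix-wiki-c4096.dat \
           mixtral-8x7b-f16.gguf mixtral-8x7b-iq3_xxs.gguf IQ3_XXS
```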

Update
I managed to claw back a lot of the gap by simply using an older version of llama.cpp. Now I'm 0.01 ppl behind your models instead of 0.1 ppl. It seems IQ3_XXS is just inconsistent, and I can't really figure out what the hell is happening.
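
For context, this is the kind of perplexity measurement I mean (a sketch; file names are placeholders):

```sh
# Measure perplexity of a quantized model on the wikitext-2 test set.
./perplexity -m mixtral-8x7b-iq3_xxs.gguf -f wiki.test.raw -ngl 99
```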
