Dataset for importance matrix?

#3
by RamoreRemora - opened

I can't tell from the model card: what dataset did you use to generate the importance matrix?

There's also a new approach to importance matrix generation that might be worth looking into:
https://github.com/ggerganov/llama.cpp/discussions/5006
A link to the final dataset, "20k_random_data.txt", is further down in that discussion.
What's your opinion?

All models here and in the other repositories were created using an imatrix computed from wiki.train.raw.

Concerning the use of (pseudo-)random data for the imatrix, I wrote a bit about that here.
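
For reference, this is roughly how an imatrix is computed with llama.cpp's imatrix tool. A sketch only: file names are placeholders, and the chunk count and -ngl values are examples to adjust for your hardware.

```sh
# Compute an importance matrix over wiki.train.raw with the default 512-token context.
# Model and output paths are placeholders; tune --chunks and -ngl for your setup.
./imatrix -m mixtral-8x7b-f16.gguf \
          -f wiki.train.raw \
          -o imatrix-wiki.dat \
          -c 512 --chunks 2000 -ngl 99
```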

Alright, thanks for your reply.
I've been experimenting with a few more imatrix datasets with middling results; I can't quite reach the ppl of your models. Did you quantize from the f16 version? I'm re-quantizing from Q8 because my system may not be able to load the f16 model, and I'm wondering if I'm leaving something on the table (all other parameters are default).
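
My re-quantization step looks roughly like this (a sketch; file names are placeholders, and --allow-requantize is needed because the input is already quantized):

```sh
# Re-quantize from the Q8_0 GGUF instead of f16, applying the imatrix.
./quantize --allow-requantize --imatrix imatrix-wiki.dat \
           mixtral-8x7b-q8_0.gguf \
           mixtral-8x7b-iq3_xxs.gguf IQ3_XXS
```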

Update:
I went through the process of downloading the 96 GB original Mixtral, converting it to an f16 GGUF, and quantizing from there. I also re-made the imatrices (though still calculated them from the Q8 Mixtral), bumping up the number of chunks considerably and creating versions with 512 and 4096 context size, still on the same wiki.train.raw dataset.
The improvement was minuscule in all cases, nowhere near enough to bridge the gap. I also tried this imatrix and this one that I found, but they scored even worse than mine.
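
For reference, the conversion / imatrix / quantization steps above, as a sketch (paths, model names, and chunk counts are placeholders for my actual values):

```sh
# Convert the original HF Mixtral checkpoint to an f16 GGUF (convert.py from the llama.cpp repo).
python convert.py ./Mixtral-8x7B-v0.1 --outtype f16 --outfile mixtral-8x7b-f16.gguf

# Re-make the imatrix with a larger context window, still from the Q8_0 model and wiki.train.raw.
./imatrix -m mixtral-8x7b-q8_0.gguf -f wiki.train.raw \
          -o imatrix-wiki-c4096.dat -c 4096 --chunks 1000 -ngl 99

# Quantize the f16 GGUF using the new imatrix.
./quantize --imatrix imatrix-wiki-c4096.dat \
           mixtral-8x7b-f16.gguf mixtral-8x7b-iq3_xxs.gguf IQ3_XXS
```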

Update
I managed to claw back a lot of the gap by simply using an older version of llama.cpp. Now I'm 0.01 ppl behind your models instead of 0.1 ppl. It seems IQ3_XXS is just inconsistent, and I can't really figure out what the hell is happening.
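
For context, this is the kind of perplexity measurement I mean (a sketch; file names are placeholders):

```sh
# Measure perplexity of a quantized model on the wikitext-2 test set.
./perplexity -m mixtral-8x7b-iq3_xxs.gguf -f wiki.test.raw -ngl 99
```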
