Compilade

compilade

Recent Activity

updated a model 13 days ago
compilade/quant-tests

compilade's activity

replied to bartowski's post 3 months ago

KLD measures the difference between 2 probability distributions, typically between a "ground truth" and a model prediction.

Yes, and from my understanding ln(PPL(Q)/PPL(base)) measures the difference between the probabilities assigned to the "correct" tokens according to the test dataset (at least for the second half of each chunk, same as for KLD). This means it would be possible to keep perplexity the same or better while also increasing KLD, by giving the non-"correct" tokens different probabilities.
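A toy illustration of that point, as a minimal sketch: the three-token vocabulary, the probabilities, and the helper names `perplexity` and `mean_kld` are all made up for this example and do not come from any real model or tool. The "quantized" distributions keep the probability of each "correct" token unchanged and only move probability mass between the other tokens.

```python
import math

def perplexity(dists, targets):
    # exp of the mean negative log-probability of the "correct" tokens
    nll = [-math.log(d[t]) for d, t in zip(dists, targets)]
    return math.exp(sum(nll) / len(nll))

def mean_kld(p_dists, q_dists):
    # mean over positions of KL(p || q), with p the base model's distribution
    total = 0.0
    for p, q in zip(p_dists, q_dists):
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return total / len(p_dists)

# Hypothetical next-token distributions at two positions (vocabulary of 3).
base = [[0.5, 0.3, 0.2], [0.6, 0.3, 0.1]]
quant = [[0.5, 0.1, 0.4], [0.2, 0.3, 0.5]]
targets = [0, 1]  # index of the "correct" token at each position

log_ppl_ratio = math.log(perplexity(quant, targets) / perplexity(base, targets))
kld = mean_kld(base, quant)

print("ln(PPL(Q)/PPL(base)):", log_ppl_ratio)
print("mean KLD:", kld)
```

Here the log perplexity ratio is zero because the "correct" tokens have identical probabilities in both models, while the KLD is clearly positive because the rest of each distribution changed.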

This makes me wonder: do all of the token probabilities have to match closely for a quantized model to still be good?

I guess it depends on whether the goal is to make a faithful quantization, or an equally good model through quantization-aware fine-tuning.
The way imatrix works, it can't really "fine-tune" a model towards a lower perplexity; it can only prioritize error reduction when quantizing the weights in the columns with more impact on the activations. So I would say that faithfulness to the full-precision model is the goal of the quantization in this case, and thus KLD feels more appropriate.
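To make "prioritize error reduction" concrete, here is a rough sketch of importance-weighted quantization of a single row. This is not llama.cpp's actual algorithm: the symmetric integer grid, the brute-force scale search, the example values, and the names `quantize`, `weighted_error` and `best_scale` are all assumptions made for illustration. The only idea it is meant to show is that per-column importance values change which rounding errors matter when a scale is chosen.

```python
def quantize(row, scale, qmax=7):
    # Round each weight to a symmetric integer grid, then dequantize.
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in row]

def weighted_error(row, deq, importance):
    # Squared error where each column's error is scaled by its importance.
    return sum(i * (w - d) ** 2 for w, d, i in zip(row, deq, importance))

def best_scale(row, importance, qmax=7, steps=200):
    # Brute-force search over candidate scales around amax / qmax.
    amax = max(abs(w) for w in row)
    candidates = [amax / qmax * (0.5 + k / steps) for k in range(steps + 1)]
    return min(
        candidates,
        key=lambda s: weighted_error(row, quantize(row, s, qmax), importance),
    )

# Hypothetical row of weights and per-column importance values.
row = [0.9, -0.31, 0.12, 0.05]
importance = [0.01, 10.0, 1.0, 1.0]
uniform = [1.0] * len(row)

s_uniform = best_scale(row, uniform)
s_weighted = best_scale(row, importance)

err_uniform = weighted_error(row, quantize(row, s_uniform), importance)
err_weighted = weighted_error(row, quantize(row, s_weighted), importance)
print("importance-weighted error, scale chosen without importance:", err_uniform)
print("importance-weighted error, scale chosen with importance:", err_weighted)
```

Nothing in this procedure looks at the test dataset's "correct" tokens; it only tries to keep the quantized weights close to the originals where that matters most, which is why the result is better judged by faithfulness to the full-precision model.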

Of course, I might be wrong; I don't really have a full understanding of the statistics going on in perplexity and KL-divergence calculations.

However, for quantization-aware fine-tuning, ln(PPL(Q)/PPL(base)) is likely a better indicator of quantization quality than KLD, unless the goal of the fine-tuning was actually to minimize KLD.

New activity in HF1BitLLM/Llama3-8B-1.58-100B-tokens 3 months ago

GGUF conversion

11
#3 opened 3 months ago by compilade
New activity in mistralai/Mamba-Codestral-7B-v0.1 5 months ago

Update hardcoded filenames

1
#1 opened 5 months ago by Wauplin
New activity in jondurbin/bagel-dpo-2.8b-v0.2 10 months ago

GGUF Please

3
#1 opened 12 months ago by HR1777
New activity in clibrain/mamba-2.8b-instruct-openhermes 10 months ago

gguf

1
#1 opened about 1 year ago by LaferriereJC
New activity in pansophic/rocket-3B about 1 year ago