What Does the Training Data File Mean?

#1
by Clausss - opened

same as title

Hey! The training/calibration data file is used to estimate the importance matrix. That matrix consists of the diagonal elements of the activation expectation matrix ⟨a a^T⟩, so each entry is the expected squared activation of a neuron. During quantization, those entries weight the squared-error minimization of the associated weights within each block, which tells the quantizer which weights matter most and should be preserved most accurately.
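If it helps to see the idea in code, here is a minimal NumPy sketch. To be clear, this is not llama.cpp's actual implementation: the function names, the toy symmetric round-to-nearest scheme, and the scale search are all assumptions made up for illustration. It only shows the two steps described above: averaging squared activations over the calibration data, then using those averages to weight the quantization error of a block.

```python
import numpy as np

def importance_from_activations(acts):
    """Estimate the diagonal of <a a^T> from calibration activations.

    acts: array of shape (n_tokens, n_inputs), one row per calibration token.
    Returns the mean squared activation per input, shape (n_inputs,).
    """
    return np.mean(np.asarray(acts, dtype=np.float64) ** 2, axis=0)

def weighted_error(w, w_q, importance):
    """Importance-weighted squared error between weights and their quantized form."""
    return float(np.sum(importance * (w - w_q) ** 2))

def quantize_block(w, importance, n_levels=7, n_candidates=64):
    """Toy symmetric quantizer (hypothetical, for illustration only).

    Tries several scales for one block and keeps the one that minimizes
    the importance-weighted squared error.
    """
    w = np.asarray(w, dtype=np.float64)
    max_abs = np.max(np.abs(w))
    if max_abs == 0.0:
        return w.copy()
    best_w_q, best_err = None, np.inf
    # Candidate scales from half of the plain max-abs scale up to the full one.
    for factor in np.linspace(0.5, 1.0, n_candidates):
        scale = factor * max_abs / n_levels
        q = np.clip(np.round(w / scale), -n_levels, n_levels)
        w_q = q * scale
        err = weighted_error(w, w_q, importance)
        if err < best_err:
            best_w_q, best_err = w_q, err
    return best_w_q

# Example: random calibration activations and one block of 32 weights.
rng = np.random.default_rng(0)
acts = rng.normal(size=(1000, 32)) * np.linspace(0.1, 3.0, 32)
block = rng.normal(size=32)
imp = importance_from_activations(acts)
block_q = quantize_block(block, imp)
```

Weights whose inputs have large expected squared activations dominate the weighted error, so the chosen scale favors representing those weights accurately.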
I recommend https://github.com/ggerganov/llama.cpp/pull/4861 and https://github.com/ggerganov/llama.cpp/discussions/5263 for more context :)
I'd also recommend using the PR I made this weekend instead: https://huggingface.co/spaces/ggml-org/gguf-my-repo/discussions/78/files. It's more thorough, and in the current iteration of the space, inference on the free CPU takes very long (https://huggingface.co/SixOpen/Phi-3-mini-4k-instruct-IQ4_NL-imat.gguf took approx. 3 hours, for instance).
Hope it helps!

Thank you

Clausss changed discussion status to closed
