Quantization formats are mixed again. Not all consumers support (or have efficient implementations for) k-quants or mixed models. output.weight should be encoded as Q4_0.
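For anyone wanting to check a file for this, here is a minimal sketch of the idea: given a listing of tensor names and their quantization types, report every tensor that is not in the expected format. The function name `find_mixed_tensors`, the `ignore` parameter, and the sample listing are all hypothetical and only for illustration; the listing is not taken from the actual model file, and skipping `F32` tensors is an assumption about how small tensors such as norms are usually stored.

```python
def find_mixed_tensors(tensor_types, expected="Q4_0", ignore=("F32",)):
    """Return {name: type} for tensors whose type differs from `expected`.

    Types listed in `ignore` are skipped (assumption: unquantized tensors
    such as norms are acceptable and not part of the complaint).
    """
    return {
        name: qtype
        for name, qtype in tensor_types.items()
        if qtype != expected and qtype not in ignore
    }


# Hypothetical tensor listing, NOT read from the real model file.
tensor_types = {
    "token_embd.weight": "Q4_0",
    "blk.0.attn_q.weight": "Q4_0",
    "blk.0.attn_norm.weight": "F32",
    "output.weight": "Q6_K",
}

mismatched = find_mixed_tensors(tensor_types)
print(mismatched)
```

With a real file, the name/type pairs would have to come from whatever tool is used to inspect the model; that step is left out here on purpose.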