Hey, not sure if this matters, since the code path defaults to not quantizing the LM head anyway (afaict), but "lm_head" is missing from this model's quantization config, while it is present in the base 405B FP8 model's config. Thanks!
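For reference, this is roughly the entry I'd expect to see, assuming a compressed-tensors-style `quantization_config` in `config.json` (the surrounding fields here are illustrative of that format, not copied from either model's actual config):

```json
{
  "quantization_config": {
    "quant_method": "compressed-tensors",
    "ignore": ["lm_head"]
  }
}
```

i.e. the base model lists `lm_head` under `ignore` so it is skipped during quantization, and that entry seems to be absent here.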