Gemma 4 26B QAT

#2
by leopoldius - opened

Have you considered training these styletunes on, or merging them with, their equivalent QAT checkpoint (google/gemma-4-26B-A4B-it-qat-q4_0-unquantized)?

Normally this wouldn't really be viable, since finetuning probably erodes the quantization resilience that QAT trains in. But as this tune only targets a portion of the model, that resilience is intact in the rest of it, right? The only special consideration might be keeping the output layer in bf16 rather than converting it during quantization, to preserve the perplexity gains.
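As a rough sketch of the merge idea, assuming the tune, the original base, and the QAT checkpoint all share the same tensor names and shapes: take the difference between the tuned and base weights and add it onto the QAT weights, only for the tensors the tune actually changed. Everything here is hypothetical (the `merge_delta` helper, the tensor names, the toy values), and plain Python lists stand in for real tensors; this is not the author's method, and whether the result keeps its quantization resilience would still need to be measured.

```python
def merge_delta(base, tuned, qat):
    """Add (tuned - base) onto qat for tensors the tune changed.

    All three arguments are dicts mapping a tensor name to a flat list of
    floats (a stand-in for real checkpoint tensors). Tensors the tune did
    not modify are copied from the QAT checkpoint unchanged.
    """
    merged, touched = {}, []
    for name, qat_w in qat.items():
        # Per-element change introduced by the finetune.
        delta = [t - b for t, b in zip(tuned[name], base[name])]
        if any(d != 0.0 for d in delta):
            merged[name] = [q + d for q, d in zip(qat_w, delta)]
            touched.append(name)
        else:
            merged[name] = list(qat_w)
    return merged, touched


# Toy example: the tune only modified the (hypothetical) attention projection.
base = {"attn.q_proj": [1.0, 2.0], "mlp.up_proj": [0.5, 0.5]}
tuned = {"attn.q_proj": [1.5, 2.0], "mlp.up_proj": [0.5, 0.5]}
qat = {"attn.q_proj": [1.25, 2.25], "mlp.up_proj": [0.75, 0.25]}

merged, touched = merge_delta(base, tuned, qat)
print(touched)  # names of tensors that received the tune's delta
```

The list of touched tensors is the part of the model where QAT resilience may have been disturbed; everything else is bit-for-bit the QAT checkpoint.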
