Gemma 4 26B QAT
#2
by leopoldius - opened
Have you considered training these styletunes on, or merging them with, their equivalent QAT checkpoint, google/gemma-4-26B-A4B-it-qat-q4_0-unquantized?
Normally this wouldn't really be viable, since finetuning probably erodes the quantization resilience that QAT trains in. But as this tune only targets a portion of the model, that resilience is intact in the rest of it, right? The only special consideration might be excluding the output layer from quantization, i.e. keeping it in bf16, to preserve the perplexity gains.
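
To make the merge idea concrete, here is a rough sketch of what I have in mind, assuming the tune can be expressed as a delta against the regular instruct checkpoint. The tensor names and values are hypothetical toy data (plain Python lists, not real Gemma parameter names), and `merge_delta_onto_qat` is just an illustrative helper, not anything from an existing library.

```python
def merge_delta_onto_qat(base, tuned, qat, tol=0.0):
    """Add (tuned - base) onto the QAT weights, but only for tensors the tune changed.

    All three arguments are toy "state dicts": name -> list of floats.
    Tensors the tune did not touch are copied from the QAT checkpoint unchanged.
    """
    merged, touched = {}, []
    for name, qat_w in qat.items():
        delta = [t - b for t, b in zip(tuned[name], base[name])]
        if any(abs(d) > tol for d in delta):
            # Tuned tensor: transplant the finetuning delta onto the QAT weights.
            merged[name] = [q + d for q, d in zip(qat_w, delta)]
            touched.append(name)
        else:
            # Untouched tensor: keep the QAT weights exactly as they are.
            merged[name] = list(qat_w)
    return merged, touched


# Hypothetical toy checkpoints; only "layers.0.attn" differs between base and tuned.
base = {"layers.0.attn": [1.0, 2.0], "layers.0.mlp": [0.5, 0.5], "lm_head": [3.0, 4.0]}
tuned = {"layers.0.attn": [1.5, 2.0], "layers.0.mlp": [0.5, 0.5], "lm_head": [3.0, 4.0]}
qat = {"layers.0.attn": [1.25, 2.25], "layers.0.mlp": [0.75, 0.25], "lm_head": [3.0, 4.5]}

merged, touched = merge_delta_onto_qat(base, tuned, qat)

# Assumption: the output layer is held back from quantization and stays in bf16.
keep_bf16 = {"lm_head"}
to_quantize = [name for name in merged if name not in keep_bf16]
```

The `touched` list also shows which tensors might have lost some of their quantization resilience, so those would be the ones to check after quantizing.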