Gemma 4 26B QAT

by leopoldius - opened 6 days ago

Have you considered training or merging these styletunes with their equivalent QAT checkpoint, google/gemma-4-26B-A4B-it-qat-q4_0-unquantized?

Normally this wouldn't really be viable, since finetuning probably erodes the quantization resilience training. But as this tune only targets a portion of the model, that resilience should still be intact in the rest of it, right? The only special consideration might be excluding the output layer from quantization and leaving it in bf16, to preserve the perplexity gains.
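To make the merge idea concrete, here is a rough sketch of what I have in mind, not a tested recipe. It assumes the tune, the original base, and the QAT checkpoint all share tensor names and shapes, and it uses plain Python lists as stand-ins for real tensors. The function names, the toy tensor names, and the "lm_head" pattern are all hypothetical placeholders, not anything taken from the actual checkpoints.

```python
from typing import Dict, List, Sequence, Tuple

Tensor = List[float]  # flattened stand-in for a real weight tensor


def graft_tuned_delta(
    base: Dict[str, Tensor],
    tuned: Dict[str, Tensor],
    qat: Dict[str, Tensor],
    tol: float = 0.0,
) -> Tuple[Dict[str, Tensor], List[str]]:
    """Add (tuned - base) onto the QAT weights, only where the tune changed something."""
    merged: Dict[str, Tensor] = {}
    touched: List[str] = []
    for name, qat_w in qat.items():
        delta = [t - b for t, b in zip(tuned[name], base[name])]
        if any(abs(d) > tol for d in delta):
            # Tensor was modified by the tune: carry its delta over to the QAT weights.
            merged[name] = [q + d for q, d in zip(qat_w, delta)]
            touched.append(name)
        else:
            # Untouched tensor: keep the QAT weights exactly as they are.
            merged[name] = list(qat_w)
    return merged, touched


def keep_in_bf16(names: Sequence[str], patterns: Sequence[str] = ("lm_head",)) -> List[str]:
    """List tensors to leave unquantized; the default pattern is an assumption."""
    return [n for n in names if any(p in n for p in patterns)]


# Toy example with made-up tensor names and values.
base = {"layers.0.attn.q_proj": [1.0, 2.0], "layers.0.mlp.up_proj": [0.5, 0.5], "lm_head": [3.0, 4.0]}
tuned = {"layers.0.attn.q_proj": [1.5, 2.0], "layers.0.mlp.up_proj": [0.5, 0.5], "lm_head": [3.0, 4.0]}
qat = {"layers.0.attn.q_proj": [1.25, 2.25], "layers.0.mlp.up_proj": [0.75, 0.25], "lm_head": [3.0, 4.5]}

merged, touched = graft_tuned_delta(base, tuned, qat)
skip_quant = keep_in_bf16(list(merged))
```

The `touched` list would also show how much of the model the tune actually changed, which is the part where the quantization resilience is most in question.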
