This is a very cool idea!

by ddh0 - opened 1 day ago

Finetuning only the output tensor is very clever!! I wonder why this isn't already a commonly used technique; it makes so much sense. We don't need to change what the model thinks, only how it translates its thoughts to speech. Cheap, effective, and much less room for error.
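
As a rough illustration of the idea (a toy, dependency-free sketch, not the training code used for this model): the "body" below stands in for every layer under the output tensor and is never updated, while gradient steps touch only the output projection. All names (`body_w`, `out_w`, `train_step`) and the tiny dataset are hypothetical, made up for the example.

```python
import math

HIDDEN, VOCAB = 4, 3

# Frozen "body": a stand-in for everything below the output tensor.
# Hypothetical fixed weights; in a real model these are the pretrained layers.
body_w = [[0.5, -0.3], [-0.8, 0.2], [0.1, 0.9], [0.7, 0.6]]

# Trainable output tensor (the unembedding / output projection in an LLM).
out_w = [[0.0] * HIDDEN for _ in range(VOCAB)]


def hidden(x):
    """Frozen forward pass: the model's 'thoughts'. Never updated."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in body_w]


def probs(h):
    """Output projection followed by a softmax over the toy vocabulary."""
    logits = [sum(w * hi for w, hi in zip(row, h)) for row in out_w]
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def loss(data):
    """Mean cross-entropy of the target tokens."""
    return -sum(math.log(probs(hidden(x))[y]) for x, y in data) / len(data)


def train_step(data, lr=0.2):
    """One SGD pass that updates out_w only; body_w is left untouched."""
    for x, y in data:
        h = hidden(x)
        p = probs(h)
        for k in range(VOCAB):
            # Gradient of cross-entropy w.r.t. logit k is p[k] - 1[k == y].
            grad_logit = p[k] - (1.0 if k == y else 0.0)
            for j in range(HIDDEN):
                out_w[k][j] -= lr * grad_logit * h[j]


# Toy dataset: (input features, target token id).
data = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 1.0], 2)]

body_before = [row[:] for row in body_w]
loss_before = loss(data)
for _ in range(200):
    train_step(data)
loss_after = loss(data)

print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
print("body unchanged:", body_w == body_before)
```

In a real framework the same effect would come from marking every parameter except the output projection as non-trainable, so the optimizer only ever sees that one tensor.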

I haven't even tried this model yet (quantizing now), I'm just excitedly nerding out about this. Do you know if there is any prior research / literature about this technique?

Thanks :)

Gryphe

Owner 1 day ago

I didn't specifically look for any papers, though one could argue I referenced my 2023 self from around the time I built MythoMax!

Radermacher quants were made available a while ago, I noticed. Just be careful with the IQ versions, Gemma 4 hates those for some reason.
