Fantastic Model
I recently discovered this model at the top of the AlpacaEval 2.0 community leaderboard, surpassing Princeton's SimPO Gemma fine-tune. I'm working with an 8-bit quantized version locally on a consumer GPU; my use case is interactive fiction. The model follows instructions exceptionally well once the prompts are dialed in. It has a unique and engaging writing style that I prefer to the SimPO or SPPO fine-tunes. It's particularly adept at maintaining consistency in complex storylines without losing track of details.
I'm curious if there's any possibility of adding a WPO-HB fine-tune of https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407?
While it's a slightly larger 12B model, it's still usable on consumer-grade GPUs at 5- or 6-bit quantization. Based on my comparisons of the official instruct models, I believe Nemo has significantly more potential than Gemma. However, Nemo currently lacks preference-optimized variants: there are no SimPO or SPPO fine-tunes available, which limits its practical appeal. I'd be interested to see whether WPO fine-tuning on Mistral-Nemo could surpass even gemma-2-9b-it-WPO-HB on AlpacaEval 2.0.
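For a rough sense of why 5- or 6-bit quantization keeps a 12B model within consumer-GPU reach, here's a back-of-the-envelope sketch. It counts weight storage only (KV cache and activation overhead are excluded), and the parameter counts are round illustrative figures, not exact checkpoint sizes:

```python
def weight_vram_gib(num_params_billion: float, bits_per_weight: float) -> float:
    """Approximate VRAM needed to hold model weights alone, in GiB.

    Ignores KV cache, activations, and runtime overhead, so real usage
    will be somewhat higher.
    """
    bytes_per_weight = bits_per_weight / 8
    return num_params_billion * 1e9 * bytes_per_weight / 1024**3

# Illustrative comparison: a 12B model (e.g. Nemo) vs. a 9B model (e.g. Gemma 2)
for bits in (8, 6, 5, 4):
    print(f"12B @ {bits}-bit ~ {weight_vram_gib(12, bits):5.1f} GiB  |  "
          f"9B @ {bits}-bit ~ {weight_vram_gib(9, bits):5.1f} GiB")
```

By this estimate, 12B weights at 5-bit come in around 7 GiB, which is why the model remains workable on 8-12 GB consumer cards even though the 8-bit footprint would be tight.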
Thank you for your contributions to advancing the state of the art in this field.
~ Joe G.