Good, But A Couple Issues

#1
by deleted - opened

I extensively tested both the original Q bert and this additional fine-tune, and while this fine-tune is a solid performer, it makes a couple of things notably worse.

  1. It increased hallucinations at the fringes of knowledge, such as in my pop-culture questions. This often happens when LLMs are excessively fine-tuned.

  2. It made storytelling far too stubborn to respect the user's prompt. Again, this happens when LLMs are excessively fine-tuned. That is, LLMs are in a battle between employing pre-packaged storytelling elements and respecting the user's prompt directives, and failing to balance the two results in absurd contradictions.

Example: The pre-packaged need to build suspense and knock on closed doors vs. the user prompt stating he was caught stealing something. This played out as hearing footsteps and then a knock on the door, and also the thief opening the door himself, yet still being caught stealing something off the counter. He obviously wouldn't have stolen something after hearing footsteps and a knock on the door, and certainly not after opening the door himself. Too much fine-tuning is pulling stories away from both the user's prompt and the countless stories within the foundational model itself.

I will attempt to reduce this in future versions. Thanks for the insight!
