Stubbornly Avoids Corrections and Sticks to Hallucinations

This LLM performed unusually well on my personal tests, and its TruthfulQA score is impressive.

However, after playing around with it, I've noticed a pattern of stubbornness across the board. For example, it will often stick to a hallucination, no matter how absurd, with the same gusto with which it sticks to the truth. It might apologize for the error, but will then repeat it. This doesn't reduce test scores, since a wrong answer is still a wrong answer, but it makes the LLM far too frustrating to use given the sheer number of hallucinations, especially with Mistral models.

And as I said, this issue manifests across the board, such as with storytelling. With most other LLMs, if a story choice contradicts the user's story prompt, the model will apologize and rewrite it with the stated correction. In contrast, this LLM often stubbornly starts defending its choice, or, even after apologizing, repeats the same line when retelling the story. And it doesn't make any sense. I'm not talking about cases like a ball falling towards the ceiling, where the LLM stubbornly makes it fall to the ground because that's how gravity works. I'm talking about perfectly realistic and common requests, like having a character prepare to leave for a trip.

Stubbornness may be a good thing when LLMs get much better and hallucinate far less, but it just doesn't work with a 7B Mistral.