Hallucinations

#2
by Ricepig - opened

This model is hallucinating much worse than ChatGPT. It's nearly impossible to get any factually correct information out of it sadly.

Yes, I noticed the same with any 7B-13B model. Hallucinations slowly fade away with 30B+ models; such small models just can't store much knowledge.

But how does it compare to other 7Bs? AFAICT, it's better than Mistral, or perhaps even Qwen-14B.

This is likely due to your sampling parameters or prompt format. What type of behavior is it demonstrating?

We observed a low hallucination rate (and high TruthfulQA accuracy). Maybe the Mistral model needs lower temperatures due to its smaller weight norm? Set temperature = 0.5 and try it.

@Ricepig BTW, can you try our demo here? Its default temperature is 0.5: https://openchat.team/
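
For anyone who wants to try the suggested setting locally, here is a minimal sketch with transformers. The openchat/openchat_3.5 checkpoint name, reliance on the tokenizer's built-in chat template, and the top_p value are assumptions, not something confirmed in this thread; if the tokenizer does not ship a chat template, format the prompt manually instead.

```python
# Minimal sketch: sample from the model at temperature 0.5 with transformers.
# Assumes the openchat/openchat_3.5 checkpoint and that accelerate is installed
# (needed for device_map="auto").
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openchat/openchat_3.5"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Who wrote 'The Remains of the Day'?"}]
# apply_chat_template formats the conversation in the model's expected prompt
# format, which avoids the prompt-format issues mentioned above.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.5,  # the lower temperature suggested in this thread
    top_p=0.9,        # assumed value, not stated in the thread
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```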

TruthfulQA...

I thought it was the prompt format, but there are still many hallucinations.

OpenChat org

Try a lower temperature, due to Mistral's much smaller weight norm (all evaluations are done with temp=0). Also, there are significantly more hallucinations when speaking languages other than English, likely because Mistral wasn't pre-trained on sufficient multilingual data.
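
To match the temp=0 evaluation setting, greedy decoding is the deterministic equivalent in transformers; this is a sketch reusing the names from the earlier example and resting on the same assumptions.

```python
# Reusing model, tokenizer, and inputs from the sketch above.
# do_sample=False gives greedy decoding, i.e. effectively temperature = 0,
# matching the evaluation setting mentioned in this thread.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```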

For a 7B model, hallucinations are inevitable...

Yes, for sure. Hopefully we'll get something like OpenChat for Mixtral 8x7B.
