Q5_X ggml models are not as accurate as the older versions

#6
by KinzyLong - opened

here is an example of Q5_1 response:

You: write me an example email to explain why the sales not met the expectations.
KoboldAI: Sure, here' Home / News / 2018 / FDA Approves New Treatment For Rare Genetic Disorder
FDA Approves New Treatment For Rare Genetic Disorder
The U.S. Food and Drug Administration (FDA) has approved a new treatment for patients with Mucopolysaccharidosis type IIIB (MPS IIIB), a rare genetic disorder that affects the body’s ability to break down and recycle certain large molecules called glycosaminoglycans (GAGs).

Doesn't look right. The new q5_1 is supposed to be more accurate if anything. And it is on my system. No idea what's going on here - is this a one-off or do you consistently get bad responses like that?

It is reproducible. Every time I ask a similar question like 'show me an example of xxxx', I get an unrelated, random response.
I was running this Q5_1 model with llama.cpp

Post the settings you use (hyperparameters - temp, top_p, etc.)

From my application, I did the same pre-handling for those parameters. Anyway, you can still find them via these keys:
{
"prompt_MaxLength":200,
"prompt_SamplerOrder":"[0,1,2,3,4,5,6]",
"prompt_RepPen":1.08,
"prompt_RepPenSlope":0.7,
"prompt_RepPenRange":1024,
"prompt_TopP":0.9,
"prompt_TopK":0,
"prompt_TopA":0,
"prompt_Typical":1,
"prompt_TFS":1,
"prompt_Temperature":0.62
}
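For reference, here is a minimal sketch of what the temperature, top-p, and top-k settings above actually do during sampling. This is an illustrative stand-alone implementation, not llama.cpp's or KoboldAI's actual sampler; KoboldAI-specific knobs (sampler order, top-a, rep-pen slope/range) are omitted.

```python
import math
import random

def sample_token(logits, temperature=0.62, top_p=0.9, top_k=0, rng=None):
    """Pick a token id from raw logits using temperature + top-k + top-p
    (nucleus) sampling, roughly matching the JSON settings above.
    top_k=0 means top-k filtering is disabled, as in the posted config."""
    rng = rng or random.Random()
    # Temperature: divide logits before softmax; lower temp -> sharper dist.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [(i, e / total) for i, e in enumerate(exps)]
    probs.sort(key=lambda p: p[1], reverse=True)
    # Optional top-k: keep only the k most probable tokens.
    if top_k > 0:
        probs = probs[:top_k]
    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, mass = [], 0.0
    for i, p in probs:
        kept.append((i, p))
        mass += p
        if mass >= top_p:
            break
    # Renormalize over the kept tokens and draw one.
    mass = sum(p for _, p in kept)
    r = rng.random() * mass
    for i, p in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][0]
```

With a temperature of 0.62 the distribution is sharpened (more deterministic than temp 1.0), and top_p 0.9 cuts the long tail, so these settings alone should not produce completely off-topic output like the example above.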