Reduce hallucinations #13
opened by bradley6597
Is there any way to limit hallucinations so the model stays within a given context?
Currently, if the information isn't in the context, it tries to answer anyway rather than saying it can't answer. The prompt I am using is the same as the one in quick_pipeline.py on your Space.
INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"
END_KEY = "### End"
INTRO_BLURB = "Below is an instruction that describes a task. Write a response that appropriately completes the request."

instruction = prompt + '\n' + context_str

prompt = """{intro}
{instruction_key}
{instruction}
{response_key}
""".format(
    intro=INTRO_BLURB,
    instruction_key=INSTRUCTION_KEY,
    instruction=instruction,
    response_key=RESPONSE_KEY,
)
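One prompt-side tweak that might help (the GROUNDED_BLURB wording below is just a sketch I made up, not something from quick_pipeline.py) is baking an explicit abstention instruction into the intro blurb:

# Sketch: a grounding-oriented variant of INTRO_BLURB. The wording is
# hypothetical; the idea is to instruct the model to abstain when the
# context lacks the answer.
GROUNDED_BLURB = (
    "Below is an instruction that describes a task, along with context to use "
    "when answering. Answer using only the provided context. If the context "
    "does not contain the answer, say \"I can't answer that from the given "
    "context.\""
)
# Then pass intro=GROUNDED_BLURB instead of intro=INTRO_BLURB in the
# .format() call above.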
Also, my generation kwargs are:
generate_kwargs = {
    "temperature": 0.01,  # 0.01
    "top_p": 0.92,  # 0.99
    "top_k": 3,  # 3
    "max_new_tokens": 512,
    "use_cache": True,
    "do_sample": True,
    "eos_token_id": tokenizer.eos_token_id,
    "pad_token_id": tokenizer.pad_token_id,
    "repetition_penalty": 1.1,  # 1.0 means no penalty, > 1.0 means penalty, 1.2 from CTRL paper
}
Any direction would be really helpful!
You can try messing with top_k (3 is quite low) and repetition_penalty (1.1 is somewhat high).
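For example, something like this (the values are starting points to experiment with, not tuned recommendations):

# Sketch: one experiment in the direction suggested above. With a wider
# top_k the model is not forced into a tiny candidate set, and with
# repetition_penalty back at 1.0 it is not pushed away from repeating
# context tokens verbatim.
generate_kwargs = {
    "temperature": 0.01,
    "top_p": 0.92,
    "top_k": 50,  # widened from 3
    "max_new_tokens": 512,
    "use_cache": True,
    "do_sample": True,
    "eos_token_id": tokenizer.eos_token_id,
    "pad_token_id": tokenizer.pad_token_id,
    "repetition_penalty": 1.0,  # relaxed from 1.1; 1.0 means no penalty
}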
Otherwise, you may want to fine-tune on whatever data you are working with. Without knowing the problem type, it is hard to say more.
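If you do fine-tune, including training pairs that explicitly teach abstention can help; a sketch in the same Instruction/Response format (the example content is invented):

# Sketch: illustrative training pairs that teach the model to answer from
# context and to abstain when the context does not contain the answer.
examples = [
    {
        "instruction": "What is the refund window?\nContext: Refunds are accepted within 30 days of purchase.",
        "response": "The refund window is 30 days from purchase.",
    },
    {
        "instruction": "What is the warranty period?\nContext: Refunds are accepted within 30 days of purchase.",
        "response": "I can't answer that from the given context.",
    },
]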
sam-mosaic changed discussion status to closed
I have the same problem. I'm trying to build a QA bot using this as the LLM component in the LangChain QA-over-docs pipeline.
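For reference, a minimal sketch of that kind of chain with an abstention prompt, assuming a classic LangChain release that still ships RetrievalQA, and assuming pipe (a transformers text-generation pipeline) and retriever (a vector-store retriever) are already defined:

# Sketch: QA over docs with a prompt that tells the model to abstain.
# `pipe` and `retriever` are assumed to exist elsewhere.
from langchain.chains import RetrievalQA
from langchain.llms import HuggingFacePipeline
from langchain.prompts import PromptTemplate

template = """Answer using only the context below. If the answer is not in
the context, say "I can't answer that from the given context."

{context}

Question: {question}
Answer:"""

qa = RetrievalQA.from_chain_type(
    llm=HuggingFacePipeline(pipeline=pipe),
    chain_type="stuff",
    retriever=retriever,
    chain_type_kwargs={"prompt": PromptTemplate.from_template(template)},
)

print(qa.run("What is the refund window?"))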