Reduce hallucinations

#13
by bradley6597 - opened

Is there any way I can limit hallucinations and keep the model's answers within a given context?

Currently, if the information isn't in the context, the model tries to answer from outside the context rather than saying it can't answer. The prompt I am using is the same as the one in quick_pipeline.py on your Space.

INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"
END_KEY = "### End"
INTRO_BLURB = "Below is an instruction that describes a task. Write a response that appropriately completes the request."

# `question` is the user's query and `context_str` is the retrieved context;
# a separate name avoids shadowing `prompt` before it is formatted below.
instruction = question + '\n' + context_str
prompt = """{intro}
{instruction_key}
{instruction}
{response_key}
""".format(
    intro=INTRO_BLURB,
    instruction_key=INSTRUCTION_KEY,
    instruction=instruction,
    response_key=RESPONSE_KEY,
)
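
One pattern that can help (an assumption on my part, not something from quick_pipeline.py) is to spell out the refusal behavior in the blurb itself. A minimal sketch reusing the template variables above; the exact wording is illustrative:

```python
# Grounding variant of the blurb; the wording is illustrative, not the
# official quick_pipeline.py text.
GROUNDED_BLURB = (
    "Below is an instruction that describes a task, paired with context. "
    "Answer using only the provided context. If the answer is not in the "
    'context, respond with "I cannot answer from the given context."'
)

instruction = question + '\n' + context_str
prompt = """{intro}
{instruction_key}
{instruction}
{response_key}
""".format(
    intro=GROUNDED_BLURB,
    instruction_key=INSTRUCTION_KEY,
    instruction=instruction,
    response_key=RESPONSE_KEY,
)
```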

Also my kwargs are:

generate_kwargs = {
    "temperature": 0.01,  # 0.01
    "top_p": 0.92,  # 0.99
    "top_k": 3,  # 3
    "max_new_tokens": 512,
    "use_cache": True,
    "do_sample": True,
    "eos_token_id": tokenizer.eos_token_id,
    "pad_token_id": tokenizer.pad_token_id,
    "repetition_penalty": 1.1,  # 1.0 = no penalty, > 1.0 penalizes; 1.2 from the CTRL paper
}
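
For reference, these kwargs would be passed straight through to generate; a minimal sketch, assuming the mosaicml/mpt-7b-instruct checkpoint (an assumption based on the Space, not stated in this thread) and the standard transformers API:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint; substitute whatever model the Space actually serves.
model_name = "mosaicml/mpt-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, trust_remote_code=True, torch_dtype=torch.bfloat16
)

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, **generate_kwargs)

# Decode only the newly generated tokens, not the echoed prompt.
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
```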

Any direction would be really helpful!

You can try messing with top_k (3 is quite low) and repetition_penalty (1.1 is somewhat high); one alternative starting point is sketched below.
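
For example, something like the following (illustrative values only, not a recommendation from this thread); for extractive QA over a fixed context, near-greedy decoding is a common starting point:

```python
# Illustrative starting point for context-grounded QA; tune on your own data.
generate_kwargs = {
    "do_sample": False,  # greedy decoding; sampling params are then ignored
    "max_new_tokens": 512,
    "use_cache": True,
    "eos_token_id": tokenizer.eos_token_id,
    "pad_token_id": tokenizer.pad_token_id,
    "repetition_penalty": 1.0,  # no penalty; raise cautiously if outputs loop
}
```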

Otherwise, you may want to fine-tune on whatever data you are working with. Without knowing the problem type, it is hard to say more.

sam-mosaic changed discussion status to closed

I have the same problem. I'm trying to build a QA bot using this as the LLM component in the LangChain QA-over-docs pipeline.
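
In case it is useful: in the (legacy) LangChain QA-over-docs pipeline you can pass a custom prompt that includes an explicit refusal instruction. A minimal sketch, assuming an older langchain install and pre-existing `llm` and `retriever` objects (both hypothetical here):

```python
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

# Hypothetical refusal-style template; `llm` and `retriever` are assumed
# to be constructed elsewhere in the pipeline.
template = """Use only the following context to answer the question.
If the answer is not in the context, say "I don't know."

{context}

Question: {question}
Answer:"""

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",  # stuff all retrieved docs into one prompt
    retriever=retriever,
    chain_type_kwargs={"prompt": PromptTemplate.from_template(template)},
)
```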
