Reduce hallucinations

Is there any way to limit hallucinations that go beyond a given context?

Currently, when the information isn't in the context, the model tries to answer anyway rather than saying it can't answer. The prompt I am using is the same as the one on your Space.

INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"
END_KEY = "### End"
INTRO_BLURB = "Below is an instruction that describes a task. Write a response that appropriately completes the request."
instruction = prompt + '\n' + context_str
prompt = """{intro}

Also, my kwargs are:

generate_kwargs = {
    "temperature": 0.01,  # 0.01
    "top_p": 0.92,  # 0.99
    "top_k": 3,  # 3
    "max_new_tokens": 512,
    "use_cache": True,
    "do_sample": True,
    "eos_token_id": tokenizer.eos_token_id,
    "pad_token_id": tokenizer.pad_token_id,
    "repetition_penalty": 1.1,  # 1.0 means no penalty, > 1.0 means penalty, 1.2 from CTRL paper
}

Any direction would be really helpful!

You can try adjusting top_k (3 is quite low) and repetition_penalty (1.1 is somewhat high).
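
As a rough sketch of what that adjustment could look like: the helper name relaxed_generate_kwargs and the values 50 and 1.0 are illustrative assumptions, not tuned recommendations (50 is the transformers default for top_k, and 1.0 disables the repetition penalty). The tokenizer-dependent entries are left out so the snippet runs on its own.

```python
def relaxed_generate_kwargs(base, top_k=50, repetition_penalty=1.0):
    """Return a copy of `base` with a wider top_k and no repetition penalty."""
    updated = dict(base)  # shallow copy so the original settings stay untouched
    updated["top_k"] = top_k
    updated["repetition_penalty"] = repetition_penalty
    return updated


# Settings from the question, minus eos_token_id / pad_token_id,
# which need a loaded tokenizer.
base_kwargs = {
    "temperature": 0.01,
    "top_p": 0.92,
    "top_k": 3,
    "max_new_tokens": 512,
    "use_cache": True,
    "do_sample": True,
    "repetition_penalty": 1.1,
}

trial_kwargs = relaxed_generate_kwargs(base_kwargs)
```

You would add eos_token_id and pad_token_id back in before passing the result to model.generate(**inputs, **trial_kwargs), then compare the answers against the original settings on a few questions whose answers are not in the context.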

Otherwise, you may want to fine-tune on the data you are working with. Without knowing the problem type, it is hard to say more.

I have the same problem. I’m trying to build a QA bot using this model as the LLM component in the LangChain QA-over-docs pipeline.

