You picked an LLM for your work, but then you find out it hallucinates!
Your first thought might be to fine-tune it on more training data... but should you?
This is what @Google is exploring in the paper "Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?"
When LLMs undergo supervised fine-tuning with new factual knowledge not present in their initial training data, there is a risk they might "hallucinate" or produce factually incorrect information.
The paper investigates how fine-tuning LLMs with new facts influences their ability to leverage pre-existing knowledge, and the extent to which they generate errors.
Technical Setup:
Approach: They introduce a method called SliCK (Sampling-based Categorization of Knowledge; don't worry too much about the acronym) to categorize each fact into one of four knowledge levels (HighlyKnown, MaybeKnown, WeaklyKnown, and Unknown), based on how well the model's sampled responses agree with the ground-truth answer. A minimal code sketch of this categorization follows the setup below.
Experimental Setup: The study uses a controlled setup focused on closed-book QA, varying the proportion of fine-tuning examples that introduce new facts versus those consistent with the model's existing knowledge.
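To make this concrete, here is a minimal Python sketch of how a SliCK-style categorization and a controlled fine-tuning mix could look. The decision rules, the `QAExample` container, and the `build_finetuning_mix` helper are my own illustration of the paper's description, not the authors' code; using multiple greedy and temperature-sampled answers per question is an assumption about the estimation details.

```python
from dataclasses import dataclass
from typing import List
import random


@dataclass
class QAExample:
    """One closed-book QA fact, with pre-computed correctness of model answers.
    (Hypothetical container; correctness would come from sampling the model
    several times per question and scoring the answers.)"""
    question: str
    answer: str
    greedy_correct: List[bool]   # correctness of greedy (T=0) answers
    sampled_correct: List[bool]  # correctness of temperature-sampled (T>0) answers


def slick_category(ex: QAExample) -> str:
    """Assign one of the four SliCK knowledge levels (assumed decision rules)."""
    p_greedy = sum(ex.greedy_correct) / len(ex.greedy_correct)
    p_sampled = sum(ex.sampled_correct) / len(ex.sampled_correct)

    if p_greedy == 1.0:
        return "HighlyKnown"   # always correct with greedy decoding
    if p_greedy > 0.0:
        return "MaybeKnown"    # sometimes correct with greedy decoding
    if p_sampled > 0.0:
        return "WeaklyKnown"   # never correct greedily, but sometimes when sampling
    return "Unknown"           # never correct, even with temperature sampling


def build_finetuning_mix(examples: List[QAExample],
                         unknown_fraction: float,
                         size: int) -> List[QAExample]:
    """Compose a fine-tuning set with a chosen proportion of Unknown examples,
    mirroring the kind of controlled mix described above (illustrative helper)."""
    unknown = [e for e in examples if slick_category(e) == "Unknown"]
    known = [e for e in examples if slick_category(e) != "Unknown"]
    n_unknown = int(round(unknown_fraction * size))
    return random.sample(unknown, n_unknown) + random.sample(known, size - n_unknown)
```

The mix helper is only there to show where the "proportion of new facts" knob sits; everything else about fine-tuning stays the same.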
Here is the gist of the findings:
LLMs struggle to integrate new factual knowledge during fine-tuning; such examples are learned more slowly than those consistent with the model's pre-existing knowledge.
As LLMs eventually do learn the examples containing new knowledge, their propensity to hallucinate increases.
Early stopping during fine-tuning can mitigate the risk of hallucinations by limiting exposure to not-yet-learned new facts (see the sketch after this list).
Training LLMs mostly on known examples leads to better utilization of pre-existing knowledge, whereas examples introducing new knowledge increase the risk of generating incorrect information.
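Since early stopping is the main practical takeaway, here is a minimal Python sketch of that idea, assuming you plug in your own training and evaluation routines; `train_one_epoch` and `eval_dev_accuracy` are placeholders, not anything from the paper's codebase.

```python
from typing import Callable


def finetune_with_early_stopping(
    train_one_epoch: Callable[[], None],     # runs one epoch of supervised fine-tuning (placeholder)
    eval_dev_accuracy: Callable[[], float],  # exact-match accuracy on a held-out dev set (placeholder)
    max_epochs: int = 20,
    patience: int = 2,
) -> int:
    """Stop fine-tuning once held-out accuracy plateaus; return the best epoch."""
    best_acc = 0.0
    best_epoch = 0
    epochs_without_gain = 0

    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        acc = eval_dev_accuracy()

        if acc > best_acc:
            best_acc, best_epoch, epochs_without_gain = acc, epoch, 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                # Dev accuracy has stopped improving; stopping here cuts off the
                # later epochs in which the slowly-learned Unknown examples
                # would otherwise get fitted.
                break

    return best_epoch
```

The rationale for stopping on held-out accuracy rather than a fixed epoch count follows from the findings above: new facts are learned late, so stopping at the plateau should end training before most of them are fitted.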