santiviquez posted an update (Feb 5)
Some of my results from experimenting with hallucination detection techniques for LLMs 🫨🔍

First, the two main ideas used in the experiments—using token probabilities and LLM-Eval scores—are taken from these three papers:

1. Looking for a Needle in a Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation (2208.05309)
2. SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models (2303.08896)
3. LLM-Eval: Unified Multi-Dimensional Automatic Evaluation for Open-Domain Conversations with Large Language Models (2305.13711)

In the first two papers, the authors claim that the average of the sentence-level token probabilities is the best heuristic for detecting hallucinations. From my results, I do see a weak positive correlation between average token probabilities and the ground truth. 🤔

The nice thing about this method is that it comes at almost no extra cost: all we need are the output token probabilities of the generated text, so it is straightforward to implement.
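As a rough sketch (not the exact setup from the papers), this is the kind of thing I mean, using the transformers generate API; the model and prompt below are just placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model, just for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "The capital of Australia is"
inputs = tokenizer(prompt, return_tensors="pt")

# ask generate() to also return the per-step scores (logits)
outputs = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=False,
    return_dict_in_generate=True,
    output_scores=True,
)

# log-probabilities of each generated token
transition_scores = model.compute_transition_scores(
    outputs.sequences, outputs.scores, normalize_logits=True
)

# average token probability of the generated sentence;
# lower values are the signal that the output may be hallucinated
avg_token_prob = torch.exp(transition_scores[0]).mean().item()
print(f"average token probability: {avg_token_prob:.3f}")
```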

The third paper proposes an evaluation scheme where we make an extra call to an LLM and kindly ask it to rate, on a scale from 0 to 5, how good the generated text is across a set of criteria. 📝🤖
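Here is a minimal sketch of that idea, assuming an OpenAI-style chat API; the prompt, the criteria, and the model name are illustrative, not the exact ones from the paper:

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

EVAL_PROMPT = """You are evaluating the quality of an answer.
Question: {question}
Answer: {answer}

Rate the answer on a scale from 0 to 5 for each criterion:
appropriateness, relevance, correctness.
Return only a JSON object, e.g. {{"appropriateness": 4, "relevance": 5, "correctness": 3}}."""


def llm_eval(question: str, answer: str) -> dict:
    # one extra LLM call per prediction we want to evaluate
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat model works; this one is just an example
        messages=[{"role": "user", "content": EVAL_PROMPT.format(question=question, answer=answer)}],
        temperature=0,
    )
    # assumes the model returns valid JSON; a sketch, so no retry/parsing guardrails
    return json.loads(response.choices[0].message.content)


scores = llm_eval("What is the capital of Australia?", "The capital of Australia is Sydney.")
print(scores)
```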

I was able to reproduce results similar to those in the paper: there is a moderate positive correlation between the ground-truth scores and the ones produced by the LLM.

Of course, this method is much more expensive since we would need one extra call to the LLM for every prediction that we would like to evaluate, and it is also very sensitive to prompt engineering. 🤷
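For completeness, one simple way to measure the kind of correlation I'm referring to (for either method) is a rank correlation between the scores and the ground-truth labels; something along these lines, where the numbers are placeholders and not my actual data:

```python
from scipy.stats import spearmanr

# placeholder scores, just to show the check; not real experimental data
llm_scores = [4, 2, 5, 1, 3, 5, 2]    # ratings from the LLM judge (or avg token probs)
ground_truth = [5, 1, 4, 2, 3, 5, 1]  # reference / human quality labels

corr, p_value = spearmanr(llm_scores, ground_truth)
print(f"Spearman correlation: {corr:.2f} (p-value: {p_value:.3f})")
```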

Would be curious to see your results using the ALTI method!


This is definitely on my list. Haven't gone through the paper, but planning to read it this week haha