The premise is that not all output tokens of a generated response are equally important: a hallucinated noun, date, or number is far more damaging than a hallucinated function word.
The idea is to have a "token selection" layer that filters the sequence of output token probabilities. Then we use only the probabilities of the relevant tokens to calculate uncertainty quantification metrics, something like the sketch below.
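For concreteness, here's a minimal sketch of the scoring part, assuming we already have per-token log-probabilities from the model and a boolean mask marking the relevant tokens (the function and argument names here are just made up for illustration):

```python
def selective_mean_nll(token_logprobs, relevant_mask):
    """Mean negative log-likelihood computed only over tokens marked as relevant.

    token_logprobs: per-token log-probabilities of the generated sequence.
    relevant_mask:  booleans of the same length, True where the token should count.
    Any other per-token confidence signal (entropy, max prob, ...) could be
    plugged in the same way.
    """
    selected = [-lp for lp, keep in zip(token_logprobs, relevant_mask) if keep]
    if not selected:  # fall back to the full sequence if nothing was selected
        selected = [-lp for lp in token_logprobs]
    return sum(selected) / len(selected)
```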
The big question is how we know which tokens are the relevant ones. 🤔
My idea is to take the decoded output sequence, run a lightweight NLP model (it doesn't need to be a fancy one) over it for named entity recognition and part-of-speech tagging, and then do uncertainty quantification only on the tokens tagged as relevant (nouns, dates, numbers, etc.); see the sketch below.
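A rough sketch of that selection step, assuming spaCy with the small English model for POS/NER and a fast HuggingFace tokenizer to get character offsets for the generated tokens (the helper names and the specific set of "relevant" tags are my own choices, not anything established):

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # a small model is enough for POS/NER

RELEVANT_POS = {"PROPN", "NOUN", "NUM"}
RELEVANT_ENTS = {"DATE", "CARDINAL", "PERSON", "ORG", "GPE"}

def relevant_char_spans(text):
    """Character spans of tokens we treat as hallucination-critical."""
    doc = nlp(text)
    return [
        (tok.idx, tok.idx + len(tok.text))
        for tok in doc
        if tok.pos_ in RELEVANT_POS or tok.ent_type_ in RELEVANT_ENTS
    ]

def build_relevance_mask(llm_token_offsets, spans):
    """Mark each LLM token whose character span overlaps a relevant span.

    llm_token_offsets: (start, end) char offsets per generated token, e.g. from
    a HuggingFace fast tokenizer called with return_offsets_mapping=True.
    """
    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]
    return [any(overlaps(off, sp) for sp in spans) for off in llm_token_offsets]
```

The character-offset alignment is there because the LLM's tokenizer and the NLP model tokenize differently, so mapping both back to character positions seemed like the simplest way to decide which model tokens fall inside a relevant word or entity.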
What are your thoughts? Curious whether anyone has tried this before, and whether it actually improves the correlation with human-annotated evaluations.