santiviquez posted an update Jan 31
Pretty novel idea on how to estimate *semantic* uncertainty. 🤔

Text generation tasks are challenging because a sentence can be written in multiple ways but still preserve its meaning.

For instance, "France's capital is Paris" means the same as "Paris is France's capital." 🇫🇷

In uncertainty quantification, we often look at token-level probabilities to quantify how "confident" an LLM is about its output. In this paper, however, the authors measure uncertainty at the level of meaning instead.
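For intuition, here's what the token-level view typically looks like: score a generation by the (length-normalized) log-probabilities of its tokens. A minimal sketch with made-up probability values; this is the baseline view, not the paper's method.

```python
import math

# Hypothetical per-token probabilities an LLM assigned while
# generating "Paris is France's capital." (values are made up).
token_probs = [0.91, 0.85, 0.97, 0.88, 0.95]

# Length-normalized log-likelihood: a common token-level
# confidence score for a generated sequence.
avg_log_prob = sum(math.log(p) for p in token_probs) / len(token_probs)
confidence = math.exp(avg_log_prob)  # geometric mean of the token probs

print(f"avg log-prob: {avg_log_prob:.3f}, confidence: {confidence:.3f}")
```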

Their motivation is that meaning is what matters for an LLM's trustworthiness: a system can be reliable even if it phrases the same answer in many different ways, but answering with inconsistent meanings signals poor reliability.

To estimate semantic uncertainty, they introduce an algorithm for clustering sequences that mean the same thing, based on the principle that two sentences share a meaning if each can be inferred from the other (bidirectional entailment). 🔄🤝
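A rough sketch of that clustering step. The `entails` callable is a stand-in for whatever entailment check you use (the paper uses an NLI model for this; the toy heuristic below is just for illustration):

```python
from typing import Callable

def cluster_by_meaning(
    sequences: list[str],
    entails: Callable[[str, str], bool],
) -> list[list[str]]:
    """Greedily group sequences whose meanings mutually entail each other."""
    clusters: list[list[str]] = []
    for seq in sequences:
        for cluster in clusters:
            rep = cluster[0]  # compare against one cluster representative
            # Bidirectional entailment: same meaning in both directions.
            if entails(seq, rep) and entails(rep, seq):
                cluster.append(seq)
                break
        else:
            clusters.append([seq])  # no matching meaning -> new cluster
    return clusters

def toy_entails(a: str, b: str) -> bool:
    """Toy stand-in for an NLI model: word-set containment."""
    words = lambda s: set(s.lower().replace(".", "").split())
    return words(a) >= words(b)

answers = ["France's capital is Paris.", "Paris is France's capital.", "It is Lyon."]
print(cluster_by_meaning(answers, toy_entails))
# [["France's capital is Paris.", "Paris is France's capital."], ['It is Lyon.']]
```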

Then they estimate the likelihood of each meaning by summing the probabilities of all sequences in its cluster, and compute the semantic entropy over those meaning-level probabilities.
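In code, that last step could look roughly like this (a minimal sketch over already-clustered sequence probabilities; the paper's Monte Carlo estimation over sampled generations is skipped here):

```python
import math

def semantic_entropy(clusters: list[list[float]]) -> float:
    """Entropy over meanings: each inner list holds the probabilities
    of the sequences in one meaning cluster."""
    meaning_probs = [sum(cluster) for cluster in clusters]
    total = sum(meaning_probs)  # renormalize over the sampled sequences
    return -sum(
        (p / total) * math.log(p / total) for p in meaning_probs if p > 0
    )

# Two phrasings of "Paris" fall in one cluster; "Lyon" in another.
print(semantic_entropy([[0.4, 0.3], [0.1]]))  # ~0.38 nats
```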

There's a lot more to it, but their results look quite nice when compared with non-semantic approaches.

Paper: *Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation* (arXiv:2302.09664)