arxiv:2309.11495

Chain-of-Verification Reduces Hallucination in Large Language Models

Published on Sep 20, 2023 · Featured in Daily Papers on Sep 21, 2023
Authors:
Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu, Roberta Raileanu, Xian Li, Asli Celikyilmaz, Jason Weston

Abstract

Generation of plausible yet incorrect factual information, termed hallucination, is an unsolved issue in large language models. We study the ability of language models to deliberate on the responses they give in order to correct their mistakes. We develop the Chain-of-Verification (CoVe) method whereby the model first (i) drafts an initial response; then (ii) plans verification questions to fact-check its draft; (iii) answers those questions independently so the answers are not biased by other responses; and (iv) generates its final verified response. In experiments, we show CoVe decreases hallucinations across a variety of tasks, ranging from list-based questions from Wikidata and closed-book MultiSpanQA to longform text generation.
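
The four CoVe steps described above map naturally onto a simple prompting loop. The following is a minimal sketch, assuming a generic complete(prompt) callable that wraps whatever LLM is used; the function name, prompt wording, and line-based parsing are illustrative assumptions, not the paper's exact prompts.

from typing import Callable, List

def chain_of_verification(question: str, complete: Callable[[str], str]) -> str:
    """Sketch of the four CoVe steps; `complete` is a hypothetical LLM call."""
    # (i) Draft an initial baseline response.
    draft = complete("Answer the question.\nQuestion: " + question + "\nAnswer:")

    # (ii) Plan verification questions that fact-check the draft.
    plan = complete(
        "List short fact-checking questions, one per line, that would verify "
        "the claims in this answer.\nQuestion: " + question +
        "\nAnswer: " + draft + "\nVerification questions:"
    )
    verification_questions: List[str] = [
        line.strip() for line in plan.splitlines() if line.strip()
    ]

    # (iii) Answer each verification question independently, without showing the
    # draft, so the answers are not biased by the original response.
    verified_facts = []
    for q in verification_questions:
        answer = complete("Answer concisely.\nQuestion: " + q + "\nAnswer:")
        verified_facts.append("Q: " + q + "\nA: " + answer)

    # (iv) Generate the final response, revising the draft against the checked facts.
    return complete(
        "Original question: " + question +
        "\nDraft answer: " + draft +
        "\nIndependently verified facts:\n" + "\n".join(verified_facts) +
        "\nWrite a corrected final answer consistent with the verified facts:"
    )

With a real model behind complete, a call such as chain_of_verification("Name 10 basketball players with 3 MVP awards", complete) would draft a list, generate one check per named player, answer those checks separately from the draft, and then rewrite the list to drop any player the checks contradict.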

Community

Very interesting. I recently created a basic way of doing this for complex blog article creation. Here is a short video: https://www.youtube.com/watch?v=RWCW648l8Ls

Super interesting in practice! I made a simple chain-of-verification (CoVe) app. You can try running CoVe to verify prompts like "Name 10 basketball players with 3 MVP awards". Try it here: https://chain-of-verification.streamlit.app/

Models citing this paper: 0

Datasets citing this paper: 0

Spaces citing this paper: 0

Collections including this paper: 27