arxiv:2308.04592

Shepherd: A Critic for Language Model Generation

Published on Aug 8, 2023 · Featured in Daily Papers on Aug 10, 2023

Abstract

As large language models improve, there is increasing interest in techniques that leverage these models' capabilities to refine their own outputs. In this work, we introduce Shepherd, a language model specifically tuned to critique responses and suggest refinements, extending beyond the capabilities of an untuned model to identify diverse errors and provide suggestions to remedy them. At the core of our approach is a high-quality feedback dataset, which we curate from community feedback and human annotations. Even though Shepherd is small (7B parameters), its critiques are either equivalent or preferred to those from established models including ChatGPT. Using GPT-4 for evaluation, Shepherd reaches an average win-rate of 53-87% compared to competitive alternatives. In human evaluation, Shepherd strictly outperforms other models and on average closely ties with ChatGPT.

Community

Is the model released?

data is sooooooo important

What is the model that the authors run critic on? (Which model is used to generate responses and subsequently criticized by Shepherd/ChatGPT/Alpaca?)

Paper author

Alpaca is used to generate responses.
