Jaward 
posted an update Mar 14
Prompt Engineering: Playing A Game of Chance With LLMs.

These days it's obvious that getting the best out of LLMs resembles a game of chance: your choice of prompts acts as your moves, shaping the model's responses as you iteratively search for the best one.

Each prompt you craft carries the potential to lead the LLM down different paths, influencing the quality and relevance of its outputs. By experimenting with various prompts and observing how the model responds, you can uncover new insights into the inner workings of these complex systems and push the boundaries of what they can achieve.

Not long ago this craft was given a name, "Prompt Engineering", and it's a job now. To better understand the "Engineering" part of it, let's go through the paper by Google's Brain Team that shed light on it: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.

The paper starts off with a clear definition of Chain-of-Thought — a coherent series of intermediate natural language reasoning steps that lead to the final answer for a problem.

The researchers explored how generating a series of intermediate reasoning steps significantly improves the ability of large language models to perform complex reasoning. They found that such reasoning abilities "emerge naturally" in sufficiently large language models via a simple method called chain of thought prompting, where a few chain of thought demonstrations are provided as exemplars in prompting.
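To make this concrete, here is a minimal sketch of few-shot chain-of-thought prompting in Python. The exemplar question and the follow-up question are the worked examples shown in the paper; which model you send the assembled prompt to, and through which API, is left open, so the completion call is only indicated in a comment.

```python
# Minimal sketch of few-shot chain-of-thought prompting.
# Assumption: the assembled prompt is sent to any text-completion LLM;
# the actual API call is not shown here.

# One chain-of-thought exemplar: the question is followed by intermediate
# reasoning steps, not just the final answer.
COT_EXEMPLAR = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can
has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis
balls. 5 + 6 = 11. The answer is 11.
"""


def build_cot_prompt(question: str) -> str:
    """Prepend the chain-of-thought exemplar to a new question so the model
    imitates the step-by-step reasoning before stating its final answer."""
    return f"{COT_EXEMPLAR}\nQ: {question}\nA:"


if __name__ == "__main__":
    new_question = (
        "The cafeteria had 23 apples. If they used 20 to make lunch and "
        "bought 6 more, how many apples do they have?"
    )
    prompt = build_cot_prompt(new_question)
    print(prompt)  # send this string to an LLM completion endpoint of your choice
```

With a plain few-shot exemplar the model tends to answer directly; with the reasoning steps included, a sufficiently large model is prompted to produce its own intermediate steps before the final answer, which is the effect the paper measures.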

Experiments on three large language models showed that chain-of-thought prompting improves performance on a range of arithmetic, commonsense, and symbolic reasoning tasks. For instance, prompting a 540B-parameter language model with just eight chain-of-thought exemplars achieves state-of-the-art accuracy on the GSM8K benchmark of math word problems, surpassing even finetuned GPT-3 with a verifier.

Read more: https://x.com/jaykef_/status/1767173517345485232?s=46&t=V2mWOpm9AdMX0spmmr0yNQ