@santiviquez on Hugging Face: "What if the retrieval goes wrong? 🐕 Retrieval Augmented Generation (RAG) is…"

What if the retrieval goes wrong? 🐕

Retrieval Augmented Generation (RAG) is a strategy to alleviate LLM hallucinations and improve the quality of generated responses.

A standard RAG architecture has two main blocks: a Retriever and a Generator.

1️⃣ When the system receives an input sequence, it uses the Retriever to retrieve the top-K most relevant documents associated with the input sequence. These documents typically come from an external source (e.g., Wikipedia) and are then concatenated to the original input's context.

2️⃣ It then uses the Generator to produce a response from the information gathered in the first step.
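To make the two blocks concrete, here is a minimal sketch of the retrieve-then-generate flow. Everything in it is an assumption for illustration: the word-overlap `score` stands in for a real retriever (which would typically use embeddings or BM25), and `generate` is a placeholder where an actual LLM call would go.

```python
from collections import Counter
from typing import List


def score(query: str, doc: str) -> int:
    """Toy relevance score: count of words shared by query and document.

    This is a stand-in; a real retriever would use dense or sparse embeddings.
    """
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    return sum((q & d).values())


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Retriever: return the top-k documents by the toy score."""
    return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]


def build_prompt(query: str, docs: List[str]) -> str:
    """Concatenate the retrieved documents to the original input."""
    context = "\n".join(docs)
    return f"{context}\n\nQuestion: {query}"


def generate(prompt: str) -> str:
    """Generator: hypothetical placeholder for an LLM call."""
    return f"[LLM response conditioned on {len(prompt)} characters of context]"


corpus = [
    "Paris is the capital of France",
    "Dogs are loyal animals",
    "France borders Spain",
]
query = "what is the capital of France"
top_docs = retrieve(query, corpus, k=2)
answer = generate(build_prompt(query, top_docs))
```

Note that nothing here checks whether `top_docs` are actually any good, which is exactly the gap discussed next.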

But what happens if the retrieval goes wrong and the retrieved documents are of very low quality?

Well, in such cases, the generated response will probably be of low quality, too. 🫠

But here is where CRAG (Corrective RAG) *might* help. I say it might help because the paper is very new, only one week old, and I don't know if anyone has actually tried this in practice 😅

The idea is to add a Knowledge Correction block between the Retrieval and Generation steps that evaluates the retrieved documents and corrects them if necessary.

This step goes as follows:

🟢 If the documents are correct, they will be refined into more precise knowledge strips and concatenated to the original context to generate a response.

🔴 If the documents are incorrect, they will be discarded, and instead, the system searches the web for complementary knowledge. This external knowledge is then concatenated to the original context to generate a response.

🟡 If the documents are ambiguous, a combination of the previous two resolutions is triggered.
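The three branches above can be sketched as a small routing function. This is only an illustration under stated assumptions, not the paper's implementation: `evaluate`, `refine`, and `web_search` are hypothetical callables you would supply, and the `upper` and `lower` thresholds are made-up values for mapping a relevance score to the correct, incorrect, and ambiguous cases.

```python
from typing import Callable, List, Tuple


def correct_knowledge(
    query: str,
    docs: List[str],
    evaluate: Callable[[str, str], float],          # hypothetical: relevance score in [0, 1]
    refine: Callable[[str, List[str]], List[str]],  # hypothetical: docs -> knowledge strips
    web_search: Callable[[str], List[str]],         # hypothetical: query -> web results
    upper: float = 0.7,                             # assumed threshold, not from the post
    lower: float = 0.3,                             # assumed threshold, not from the post
) -> Tuple[str, List[str]]:
    """Route retrieved documents through one of the three corrective actions."""
    scores = [evaluate(query, doc) for doc in docs]
    best = max(scores, default=0.0)  # no documents at all is treated as "incorrect"

    if best >= upper:
        # Correct: keep the documents, refined into knowledge strips.
        return "correct", refine(query, docs)
    if best < lower:
        # Incorrect: discard the documents and fall back to web search.
        return "incorrect", web_search(query)
    # Ambiguous: combine refined documents with web results.
    return "ambiguous", refine(query, docs) + web_search(query)
```

The returned knowledge list is what would then be concatenated to the original context before calling the Generator.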

The experimental results in the paper show that the CRAG strategy outperforms traditional RAG approaches on both short- and long-form text generation tasks.

Paper: Corrective Retrieval Augmented Generation (2401.15884)
