
Agentic RAG

Theoretical review

What is Retrieval-Augmented Generation (RAG)?

Published by Meta AI at NeurIPS 2020 -> Paper.

1. Overview:

  • Hybrid Model: Combines the strengths of retrieval-based and generation-based models.
  • Purpose: Enhances LLM text generation by incorporating relevant external information, improving accuracy and context.

2. Structure:

  • Retriever:
    • Function: Searches a large corpus (e.g., documents, articles) to find relevant information based on the input query.
    • Techniques: Uses methods like BM25, dense retrieval, or neural retrievers.
  • Generator:
    • Function: Generates a response using the information retrieved.
    • Model: Typically a transformer-based model like GPT-4 or Llama.
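The lexical side of retrieval mentioned above (BM25) can be illustrated with a minimal scorer. This is a toy sketch in pure Python: the corpus, whitespace tokenization, and the `k1`/`b` constants are illustrative assumptions, not part of this document.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with the BM25 ranking function."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(d) for d in tokenized) / len(tokenized)  # average doc length
    n = len(docs)
    scores = []
    for doc in tokenized:
        tf = Counter(doc)  # term frequencies in this document
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for d in tokenized if term in d)  # document frequency
            if df == 0:
                continue
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)  # smoothed IDF
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "RAG combines retrieval with generation",
    "Transformers are sequence models",
    "Retrieval fetches relevant documents for a query",
]
print(bm25_scores("retrieval documents", docs))
```

A dense retriever would instead embed query and documents into vectors and rank by similarity; a vector database such as Chroma handles that storage and lookup.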

3. Workflow:

  1. Input Query: User provides a query or prompt.
  2. Document Retrieval:
    • The retriever fetches a set of relevant documents or passages.
    • These documents provide context and factual information.
  3. Response Generation:
    • The generator uses the retrieved documents to produce a coherent and contextually accurate response.
    • Ensures the generated text is informed by the most relevant information available.
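The three workflow steps above can be sketched end to end. This is a toy illustration: a word-overlap retriever stands in for BM25/dense retrieval, and `generate` only builds the prompt a real system would send to an LLM (all names here are illustrative assumptions).

```python
def retrieve(query, corpus, k=2):
    """Step 2: rank corpus passages by word overlap with the query (toy retriever)."""
    q = set(query.lower().split())
    ranked = sorted(corpus, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return ranked[:k]

def generate(query, context):
    """Step 3: stub generator -- a real system would send this prompt to an LLM."""
    prompt = "Answer using the context below.\n"
    prompt += "\n".join(f"- {passage}" for passage in context)
    prompt += f"\nQuestion: {query}"
    return prompt

corpus = [
    "RAG was published by Meta AI at NeurIPS 2020.",
    "BM25 is a lexical ranking function.",
    "Chroma is an open-source vector database.",
]
context = retrieve("Who published RAG and when?", corpus)   # step 2
print(generate("Who published RAG and when?", context))     # step 3
```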

4. Benefits:

  1. Enhanced Accuracy: By grounding responses in real-world data, RAG models significantly improve the accuracy of generated content.
  2. Reduced Hallucinations: The integration of external knowledge helps mitigate the risk of generating incorrect or nonsensical responses.
  3. Scalability: RAG systems can handle vast amounts of data, making them suitable for enterprise-level applications.

Pipeline



What is Agentic RAG?

1. Overview:

  • Enhanced RAG: Extends RAG by adding agent-like capabilities.
  • Purpose: Designed to perform tasks autonomously, interacting with various tools and APIs to achieve specific goals.

2. Structure:

  • Retriever:
    • Function: Similar to RAG, it fetches relevant documents based on the input query.
  • Generator:
    • Function: Generates an initial response using the retrieved documents.
  • Agent Module:
    • Function: Evaluates the generated response, cross-references it with the knowledge base, and makes corrections if discrepancies are found.

3. Workflow:

  1. Input Query: User provides a query/question/task.
  2. Document Retrieval: The retriever fetches relevant documents to provide context.
  3. Initial Response Generation: The generator creates a preliminary answer using the retrieved information.
  4. Response Verification: The agent system assesses the initial response against the knowledge base to ensure accuracy.
  5. Response Correction (if needed): If inaccuracies are detected, the agent system refines the response to align with verified information.
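The verify-and-correct loop in steps 4-5 can be sketched as follows. The `verify` function here is a naive claim-in-knowledge-base check standing in for a real agent's reasoning; every function name and the correction policy are illustrative assumptions.

```python
def verify(answer, knowledge_base):
    """Step 4: flag sentences in the answer that are not backed by the knowledge base."""
    facts = {f.lower() for f in knowledge_base}
    claims = [c.strip() for c in answer.split(".") if c.strip()]
    return [c for c in claims if c.lower() not in facts]

def correct(answer, unsupported):
    """Step 5 (toy policy): drop unsupported claims and keep only verified ones."""
    claims = [c.strip() for c in answer.split(".") if c.strip()]
    kept = [c for c in claims if c not in unsupported]
    return ". ".join(kept) + "." if kept else "No verified answer found."

knowledge_base = ["RAG was introduced at NeurIPS 2020", "Chroma is a vector database"]
draft = "RAG was introduced at NeurIPS 2020. RAG was invented in 1995."
unsupported = verify(draft, knowledge_base)   # step 4
final = correct(draft, unsupported)           # step 5
print(final)
```

A production agent would instead re-query the retriever or re-prompt the LLM to repair the flagged claims rather than simply deleting them.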

4. Benefits:

  1. Improved Reliability: The agent system's verification process ensures responses are accurate and trustworthy.
  2. Dynamic Correction: Enables real-time adjustments to responses, enhancing the system's adaptability to new information.
  3. User Trust: By providing verified answers, the system builds greater user confidence in its outputs.

Pipeline



Frameworks recommended for Agents

LangChain
LangGraph
AutoGen
SmolAgent
PydanticAI
Vector Database: Chroma

Frameworks recommended for user interfaces

Streamlit
Gradio
Chainlit


Code Examples
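As a starting point, here is a minimal, self-contained sketch wiring the whole Agentic RAG loop together: a toy overlap retriever, a stubbed generator, and a naive verification agent. Every class and method name is an illustrative assumption; a production system would use LangChain/LangGraph with a real LLM and a vector database such as Chroma.

```python
class AgenticRAG:
    """Toy Agentic RAG pipeline: retrieve -> generate -> verify -> correct."""

    def __init__(self, corpus):
        self.corpus = corpus

    def retrieve(self, query, k=2):
        # Step 2: rank passages by word overlap (stand-in for BM25/dense retrieval).
        q = set(query.lower().split())
        ranked = sorted(self.corpus,
                        key=lambda d: len(q & set(d.lower().split())),
                        reverse=True)
        return ranked[:k]

    def generate(self, query, context):
        # Step 3: stub generator -- echo the best passage (a real system calls an LLM).
        return context[0] if context else "I don't know."

    def verify(self, answer):
        # Step 4: naive agent check -- accept only answers found verbatim in the corpus.
        return answer in self.corpus

    def answer(self, query):
        context = self.retrieve(query)           # step 2
        draft = self.generate(query, context)    # step 3
        if self.verify(draft):                   # step 4
            return draft
        return "Unverified answer withheld."     # step 5 (toy correction policy)

rag = AgenticRAG([
    "RAG was published by Meta AI at NeurIPS 2020.",
    "Agentic RAG adds a verification agent on top of RAG.",
])
print(rag.answer("What does Agentic RAG add?"))
```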

References

  1. Video: What is Agentic RAG?
  2. Video: LangChain vs LangGraph
  3. Video: Build Your Own AI Agent System from scratch!
  4. Course: AI Agents in LangGraph
  5. Course: Advanced Retrieval for AI with Chroma
  6. A Comprehensive Guide to Building Agentic RAG Systems with LangGraph
  7. LeewayHertz: Agentic RAG
  8. Vectorize: How I finally got agentic RAG to work right
  9. Simple Agentic RAG for Multi Vector stores with LangChain and LangGraph
