Agentic RAG
Theoretical review
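What is Retrieval-Augmented Generation (RAG)?
Introduced by researchers at Facebook AI Research (now Meta AI) and collaborators, and published at NeurIPS 2020. Paper: "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks" (Lewis et al.).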
1. Overview:
- Hybrid Model: Combines the strengths of retrieval-based and generation-based models.
- Purpose: Enhances LLM text generation by incorporating relevant external information, improving accuracy and context.
2. Structure:
- Retriever:
  - Function: Searches a large corpus (e.g., documents, articles) to find information relevant to the input query.
  - Techniques: Uses sparse lexical methods such as BM25, or dense retrieval based on neural embeddings.
- Generator:
  - Function: Generates a response using the retrieved information.
  - Model: Typically a transformer-based model like GPT-4 or Llama.
3. Workflow:
- Input Query: User provides a query or prompt.
- Document Retrieval:
  - The retriever fetches a set of relevant documents or passages.
  - These documents provide context and factual information.
- Response Generation:
  - The generator uses the retrieved documents to produce a coherent and contextually accurate response.
  - This keeps the generated text grounded in the most relevant information available.
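The workflow above can be sketched as a retrieve-then-generate function. This is an illustrative outline only: the word-overlap retriever stands in for a real retriever, `echo_generator` is a placeholder for an LLM call, and every name here is hypothetical.

```python
from typing import Callable, List


def retrieve(query: str, corpus: List[str], top_k: int = 2) -> List[str]:
    """Toy retriever: rank passages by how many words they share with the query."""
    query_words = set(query.lower().split())

    def overlap(passage: str) -> int:
        return len(query_words & set(passage.lower().split()))

    ranked = sorted(corpus, key=overlap, reverse=True)
    # Drop passages that share nothing with the query.
    return [p for p in ranked[:top_k] if overlap(p) > 0]


def build_prompt(query: str, passages: List[str]) -> str:
    """Put the retrieved passages in front of the question as numbered context."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


def rag_answer(query: str, corpus: List[str], generate: Callable[[str], str]) -> str:
    """Input query -> document retrieval -> response generation."""
    passages = retrieve(query, corpus)
    prompt = build_prompt(query, passages)
    return generate(prompt)


def echo_generator(prompt: str) -> str:
    # Placeholder: a real system would send the prompt to an LLM here.
    return prompt


corpus = [
    "The retriever fetches relevant passages",
    "The generator writes the final response",
    "Bananas are yellow",
]
print(rag_answer("what does the retriever fetch", corpus, echo_generator))
```

Passing the generator in as a callable keeps the sketch independent of any specific model or provider.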
4. Benefits:
- Enhanced Accuracy: By grounding responses in retrieved source documents rather than relying only on the model's parameters, RAG models can improve the factual accuracy of generated content.
- Reduced Hallucinations: The integration of external knowledge helps mitigate the risk of generating incorrect or nonsensical responses.
- Scalability: RAG systems can handle vast amounts of data, making them suitable for enterprise-level applications.
Pipeline
What is Agentic RAG?
1. Overview:
- Enhanced RAG: Extends RAG by adding agent-like capabilities.
- Purpose: Designed to perform tasks autonomously, interacting with various tools and APIs to achieve specific goals.
2. Structure:
- Retriever:
  - Function: Similar to RAG, it fetches relevant documents based on the input query.
- Generator:
  - Function: Generates an initial response using the retrieved documents.
- Agent Module:
  - Function: Evaluates the generated response, cross-references it with the knowledge base, and makes corrections if discrepancies are found.
3. Workflow:
- Input Query: User provides a query/question/task.
- Document Retrieval: The retriever fetches relevant documents to provide context.
- Initial Response Generation: The generator creates a preliminary answer using the retrieved information.
- Response Verification: The agent system assesses the initial response against the knowledge base to ensure accuracy.
- Response Correction (if needed): If inaccuracies are detected, the agent system refines the response to align with verified information.
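The verify-and-correct loop described in this workflow might look like the sketch below. It assumes the retriever, generator, and verifier are supplied as plain callables; the `toy_*` functions are hypothetical stand-ins (a real verifier would typically be another LLM call or a fact-checking tool), and the substring check is deliberately naive.

```python
from typing import Callable, List, Optional


def agentic_rag(
    query: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str], Optional[List[str]]], str],
    verify: Callable[[str, List[str]], List[str]],
    max_rounds: int = 2,
) -> str:
    """Generate an answer, then verify and correct it up to max_rounds times."""
    passages = retrieve(query)                # Document retrieval
    answer = generate(query, passages, None)  # Initial response generation
    for _ in range(max_rounds):
        problems = verify(answer, passages)   # Response verification
        if not problems:
            break
        # Response correction: regenerate with the verifier's feedback.
        answer = generate(query, passages, problems)
    return answer


# --- Hypothetical stand-ins so the loop can run end to end ---
KNOWLEDGE_BASE = ["RAG was presented at NeurIPS 2020"]


def toy_retrieve(query: str) -> List[str]:
    return KNOWLEDGE_BASE


def toy_generate(query: str, passages: List[str], feedback: Optional[List[str]]) -> str:
    if feedback is None:
        return "RAG was presented at NeurIPS 2019"  # deliberately wrong first draft
    return passages[0]  # corrected answer copied from the knowledge base


def toy_verify(answer: str, passages: List[str]) -> List[str]:
    # Naive check: the answer must appear verbatim in some retrieved passage.
    if any(answer in p for p in passages):
        return []
    return ["answer is not supported by the retrieved passages"]


print(agentic_rag("When was RAG presented?", toy_retrieve, toy_generate, toy_verify))
```

Capping the loop with `max_rounds` is a design choice in this sketch: it bounds cost and latency when the verifier keeps rejecting the answer.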
4. Benefits:
- Improved Reliability: The agent system's verification step checks responses against the knowledge base, making them more likely to be accurate and trustworthy.
- Dynamic Correction: Enables real-time adjustments to responses, enhancing the system's adaptability to new information.
- User Trust: By providing verified answers, the system builds greater user confidence in its outputs.
Pipeline
Frameworks recommended for Agents
LangChain
LangGraph
AutoGen
smolagents
PydanticAI
Vector Database: Chroma
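As a rough illustration of what a vector database does for retrieval, the sketch below stores texts alongside vectors and returns the nearest ones by cosine similarity. It is not Chroma's API: `ToyVectorStore` and its methods are hypothetical, and word counts stand in for the learned embeddings a real system would use.

```python
import math
from collections import Counter


class ToyVectorStore:
    """In-memory store that ranks texts by cosine similarity to a query."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    @staticmethod
    def embed(text):
        # Assumption: bag-of-words counts replace a real embedding model.
        return Counter(text.lower().split())

    @staticmethod
    def cosine(a, b):
        dot = sum(a[t] * b[t] for t in a if t in b)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        if norm_a == 0 or norm_b == 0:
            return 0.0
        return dot / (norm_a * norm_b)

    def add(self, text):
        self._items.append((text, self.embed(text)))

    def query(self, text, n_results=1):
        q = self.embed(text)
        ranked = sorted(
            self._items,
            key=lambda item: self.cosine(q, item[1]),
            reverse=True,
        )
        return [stored_text for stored_text, _ in ranked[:n_results]]


store = ToyVectorStore()
for doc in [
    "BM25 is a sparse lexical ranking function",
    "Dense retrieval compares embedding vectors",
    "Chroma is a vector database",
]:
    store.add(doc)
print(store.query("vector database"))
```

A production vector database adds persistence, metadata filtering, and approximate nearest-neighbor indexes on top of this basic idea.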
Frameworks recommended for developing user interfaces
Code Examples
References
- Video: What is Agentic RAG?
- Video: LangChain vs LangGraph
- Video: Build Your Own AI Agent System from scratch!
- Course: AI Agents in LangGraph
- Course: Advanced Retrieval for AI with Chroma
- A Comprehensive Guide to Building Agentic RAG Systems with LangGraph
- leewayhertz: Agentic RAG
- Vectorize: How I finally got agentic RAG to work right
- Simple Agentic RAG for Multi Vector stores with LangChain and LangGraph