LightRAG: Simple and Fast Retrieval-Augmented Generation
Paper: arXiv 2410.05779
Production-grade Retrieval-Augmented Generation with hybrid retrieval, graph-based reasoning, and rigorous evaluation
A 3-tier RAG system that goes far beyond basic "vector search → LLM" tutorials:
| Tier | What It Does | How It's Different |
|---|---|---|
| Tier 1: Basic | Dense vector search → LLM | Baseline (what tutorials teach) |
| Tier 2: Hybrid | BM25 + Dense + RRF fusion + Cross-encoder reranking → LLM | Production-grade retrieval |
| Tier 3: Graph | LightRAG knowledge graph + multi-hop reasoning → LLM | Research-grade, multi-hop Q&A |
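Tier 2's fusion step needs no tuned weights, which is why RRF is a safe default. A minimal pure-Python sketch of Reciprocal Rank Fusion (the function name and example doc IDs are illustrative, not the project's actual API):

```python
def rrf_fuse(rankings, k=60):
    """Merge several ranked lists with Reciprocal Rank Fusion.

    rankings: list of ranked doc-id lists (best first).
    k: damping constant; 60 is the value from the original RRF paper.
    Each doc scores sum(1 / (k + rank)) over the lists it appears in.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Example: BM25 and dense retrieval disagree on order; RRF rewards
# documents both retrievers surface.
bm25_hits = ["d3", "d1", "d7"]
dense_hits = ["d1", "d3", "d9"]
print(rrf_fuse([bm25_hits, dense_hits]))  # d1 and d3 outrank d7 and d9
```

Because RRF only looks at ranks, not raw scores, it is robust to the incomparable score scales of BM25 and cosine similarity.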
All three tiers are evaluated with RAGAS metrics to prove the improvements aren't just theoretical.
| Metric | Tier 1 (Basic) | Tier 2 (Hybrid) | Tier 3 (Graph) |
|---|---|---|---|
| Faithfulness | ~X.XX | ~X.XX | ~X.XX |
| Answer Relevancy | ~X.XX | ~X.XX | ~X.XX |
| Context Recall | ~X.XX | ~X.XX | ~X.XX |
| Context Precision | ~X.XX | ~X.XX | ~X.XX |
(Fill in after running evaluation; these numbers go in your resume!)
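To produce those numbers, RAGAS consumes a table of questions, generated answers, retrieved contexts, and references. A sketch of the expected shape, using stdlib only; the column names follow common RAGAS releases and the actual `evaluate` call (shown as a comment) needs the `ragas` package plus an LLM key:

```python
# Sketch: assembling the evaluation set RAGAS consumes.
# Column names ("question", "answer", "contexts", "ground_truth") follow
# common RAGAS versions; verify against your installed release.
eval_rows = {
    "question": ["What does RRF stand for?"],
    "answer": ["Reciprocal Rank Fusion."],  # the tier output under test
    "contexts": [["RRF (Reciprocal Rank Fusion) merges ranked lists."]],
    "ground_truth": ["Reciprocal Rank Fusion"],
}

# With ragas installed and an LLM key set, evaluation looks roughly like:
#   from datasets import Dataset
#   from ragas import evaluate
#   from ragas.metrics import faithfulness, answer_relevancy
#   scores = evaluate(Dataset.from_dict(eval_rows),
#                     metrics=[faithfulness, answer_relevancy])

# Sanity-check the table shape before spending LLM calls
assert all(len(v) == len(eval_rows["question"]) for v in eval_rows.values())
```

Note that `contexts` is a list of lists: each question carries all chunks retrieved for it.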
```
Advanced RAG Pipeline

Documents
 │
 ├─▶ Chunking (recursive, 512 tokens, 50 overlap)
 │
 ├─▶ TIER 1: Dense Retrieval
 │    └── BGE-small embeddings → FAISS index → Top-K
 │
 ├─▶ TIER 2: Hybrid Retrieval
 │    ├── BM25 (sparse) ──┐
 │    ├── BGE (dense) ────┼─▶ RRF Fusion → Cross-encoder → Top-K
 │    └── Reciprocal Rank Fusion
 │
 └─▶ TIER 3: Graph Retrieval (LightRAG)
      ├── Entity Extraction → Knowledge Graph
      ├── Local queries (specific entities)
      ├── Global queries (abstract themes)
      └── Hybrid mode (best of both)

Query
 │
 ├─▶ Retrieve relevant contexts (any tier)
 ├─▶ Rerank with cross-encoder
 ├─▶ Generate answer with LLM (Groq API / HF Inference)
 └─▶ Evaluate with RAGAS (faithfulness, relevancy, recall)

Gradio Interface
 ├── Chat tab (ask questions)
 ├── Upload tab (add documents)
 ├── Compare tab (side-by-side tier comparison)
 └── Eval tab (RAGAS scores)
```
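The chunking step windows each document at 512 tokens with a 50-token overlap so no fact is split cleanly across a boundary. A simplified sketch using whitespace words as stand-in tokens (the pipeline's recursive splitter works on real tokenizer counts, but the windowing logic is the same):

```python
def chunk_tokens(text, size=512, overlap=50):
    """Split text into overlapping chunks.

    Whitespace words stand in for tokens here; a production splitter
    would count tokenizer tokens, but the sliding window is identical.
    """
    tokens = text.split()
    step = size - overlap  # each advance leaves `overlap` tokens shared
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + size]))
        if start + size >= len(tokens):
            break
    return chunks

# A 1100-"token" document yields 3 chunks: 0-511, 462-973, 924-1099
doc = " ".join(f"tok{i}" for i in range(1100))
chunks = chunk_tokens(doc)
print(len(chunks))  # 3
```

The 50-token overlap means the last 50 tokens of one chunk reappear at the start of the next, which keeps sentences that straddle a boundary retrievable from at least one chunk.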
| Component | Tool | Why |
|---|---|---|
| Dense Embeddings | BAAI/bge-small-en-v1.5 (33MB) | Best quality/size ratio, CPU-fast |
| Sparse Retrieval | rank_bm25 | Classic term-matching, complements dense |
| Fusion | Reciprocal Rank Fusion (RRF) | No tuning needed, robust across domains |
| Reranker | cross-encoder/ms-marco-MiniLM-L6-v2 | Best CPU reranker (74.3 NDCG@10) |
| Graph RAG | LightRAG (34K GitHub stars) | Entity-relationship graphs for multi-hop |
| LLM | Groq API (free, Llama 3.3 70B) | Zero cost, fast, high quality |
| Evaluation | RAGAS | Standard RAG evaluation framework |
| Frontend | Gradio → HF Spaces | Free deployment, no GPU needed |
| Vector Store | FAISS (CPU) | Fast, no server needed |
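For intuition on the sparse side of Tier 2, here is a compact pure-Python sketch of the BM25 scoring that `rank_bm25`'s `BM25Okapi` implements (this version skips the library's IDF floor and precomputation; `k1=1.5, b=0.75` are common defaults):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against a tokenized query with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N          # average doc length
    df = Counter(t for d in docs for t in set(d))  # document frequency
    scores = []
    for doc in docs:
        tf = Counter(doc)
        s = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((N - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term-frequency saturation + length normalization
            norm = tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(doc) / avgdl))
            s += idf * norm
        scores.append(s)
    return scores

docs = [["hybrid", "retrieval", "fuses", "results"],
        ["dense", "vectors", "capture", "semantics"],
        ["bm25", "rewards", "exact", "term", "matches"]]
print(bm25_scores(["exact", "term", "matches"], docs))  # last doc wins
```

This is exactly the behavior that complements dense retrieval: BM25 scores zero on documents with no lexical overlap, even when they are semantically related, which is why the pipeline fuses both signals.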
```
project2_advanced_rag/
├── README.md               # This file
├── requirements.txt        # Dependencies
├── rag_engine.py           # Core RAG engine (all 3 tiers)
├── evaluation.py           # RAGAS evaluation pipeline
├── app.py                  # Gradio web interface
├── ingest_sample_data.py   # Download & index sample documents
├── config.py               # Configuration (API keys, model names)
└── sample_data/            # Sample documents for demo
    └── README.md
```
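A plausible sketch of what `config.py` centralizes, inferred from the tech-stack table; the constant names, the Groq model id, and `TOP_K` are assumptions, not the actual file contents:

```python
import os

# Hypothetical config.py, inferred from the tech-stack table above.
GROQ_API_KEY = os.environ.get("GROQ_API_KEY", "")

LLM_MODEL = "llama-3.3-70b-versatile"  # Groq model id; check their current catalog
EMBED_MODEL = "BAAI/bge-small-en-v1.5"
RERANK_MODEL = "cross-encoder/ms-marco-MiniLM-L6-v2"

CHUNK_SIZE = 512     # tokens per chunk (from the pipeline diagram)
CHUNK_OVERLAP = 50   # tokens shared between adjacent chunks
TOP_K = 5            # assumed retrieval depth, not stated in this README
```

Reading the key from the environment keeps secrets out of the repo and matches the HF Spaces secrets workflow below.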
```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Set up an API key (free)
#    Go to https://console.groq.com → create an API key
export GROQ_API_KEY="your-key-here"

# 3. Index sample documents
python ingest_sample_data.py

# 4. Launch the app
python app.py
# Opens at http://localhost:7860
```
```bash
# 1. Create a new Space on huggingface.co
# 2. Upload all files
# 3. Add GROQ_API_KEY to the Space secrets
# 4. It deploys automatically!
```
| Service | What For | Link |
|---|---|---|
| Groq | LLM (Llama 3.3 70B) | console.groq.com |
| HuggingFace | Embeddings (optional, runs locally) | huggingface.co/settings/tokens |