🎓 RAG Pedagogical Demo

A pedagogical web application that demonstrates Retrieval-Augmented Generation (RAG) for students and learners.

🌟 Features

  • Bilingual Interface (English/French)
  • Document Processing: Upload PDF documents or use default corpus
  • Configurable Retrieval:
    • Choose embedding models
    • Adjust chunk size and overlap
    • Set top-k and similarity thresholds
  • Configurable Generation:
    • Select different LLMs
    • Adjust temperature and max tokens
  • Educational Visualization:
    • View retrieved chunks with similarity scores
    • See the exact prompt sent to the LLM
    • Understand each step of the RAG pipeline

🚀 Quick Start

Local Installation

# Clone the repository
git clone <your-repo-url>
cd RAG_pedago

# Install dependencies
pip install -r requirements.txt

# Run the application
python app.py

HuggingFace Spaces

This application is designed to run on HuggingFace Spaces with ZeroGPU support.

  1. Create a new Space on HuggingFace
  2. Select "Gradio" as the SDK
  3. Enable ZeroGPU in Space settings
  4. Upload all files from this repository
  5. The app will automatically deploy

📚 Usage

1. Corpus Management

  • Upload your own PDF document or use the included default corpus about RAG
  • Configure chunk size (100-1000 characters) and overlap (0-200 characters); a chunking sketch follows this list
  • Process the corpus to create embeddings
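
A minimal sketch of what the chunking step does (the function name chunk_text and its defaults are illustrative, not taken from rag_system.py):

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of roughly chunk_size characters that overlap by `overlap`."""
    chunks, start = [], 0
    while start < len(text):
        end = start + chunk_size
        chunks.append(text[start:end])
        if end >= len(text):
            break
        start = end - overlap  # step back so consecutive chunks share `overlap` characters
    return chunks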

2. Retrieval Configuration

  • Choose an embedding model:
    • all-MiniLM-L6-v2: Fast, lightweight
    • all-mpnet-base-v2: Better quality, slower
    • paraphrase-multilingual-MiniLM-L12-v2: Multilingual support
  • Set top-k (1-10): Number of chunks to retrieve
  • Set similarity threshold (0.0-1.0): Minimum similarity score
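
Under the hood, retrieval amounts to embedding the query and searching the FAISS index. A sketch assuming normalized embeddings and an inner-product index (variable and function names are illustrative, not the exact code in rag_system.py):

import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
chunks = ["...chunk 1...", "...chunk 2..."]  # output of the chunking step

# With normalized embeddings, inner product equals cosine similarity
embeddings = model.encode(chunks, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(np.asarray(embeddings, dtype="float32"))

def retrieve(query: str, top_k: int = 3, threshold: float = 0.3):
    q = model.encode([query], normalize_embeddings=True)
    scores, ids = index.search(np.asarray(q, dtype="float32"), top_k)
    return [(chunks[i], float(s)) for s, i in zip(scores[0], ids[0]) if s >= threshold]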

3. Generation Configuration

  • Select a language model:
    • zephyr-7b-beta: Fast, good quality
    • Mistral-7B-Instruct-v0.2: High quality
    • Llama-2-7b-chat-hf: Alternative option
  • Adjust temperature (0.0-2.0): Controls creativity
  • Set max tokens (50-1000): Response length
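
The generation step is a single call to the HuggingFace Inference API. A sketch using huggingface_hub (the model ID and default values are illustrative; app.py may use a different client or parameters):

from huggingface_hub import InferenceClient

client = InferenceClient(model="HuggingFaceH4/zephyr-7b-beta")  # needs an HF API token configured

def generate(prompt: str, temperature: float = 0.7, max_tokens: int = 300) -> str:
    # temperature controls sampling randomness; max_new_tokens caps the response length
    return client.text_generation(prompt, temperature=temperature, max_new_tokens=max_tokens)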

4. Query & Results

  • Enter your question
  • Use example questions to get started
  • View the generated answer
  • Examine retrieved chunks with similarity scores
  • Inspect the prompt sent to the LLM
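
The prompt shown in the interface is simply the retrieved chunks stitched around your question. A typical template (the exact wording used by app.py may differ):

def build_prompt(question: str, retrieved: list[tuple[str, float]]) -> str:
    context = "\n\n".join(chunk for chunk, _score in retrieved)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )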

🏗️ Architecture

┌─────────────────┐
│  PDF Document   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Text Chunking  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Embeddings    │◄──── Embedding Model
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  FAISS Index    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  User Query     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Retrieval     │──► Top-K Chunks
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Generation    │◄──── Language Model
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│     Answer      │
└─────────────────┘
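
End to end, the pipeline composes these stages. A sketch reusing the illustrative retrieve, build_prompt, and generate helpers from the Usage section above (not the literal functions in rag_system.py):

def answer_question(question: str) -> str:
    retrieved = retrieve(question, top_k=3, threshold=0.3)     # similarity search over the FAISS index
    prompt = build_prompt(question, retrieved)                 # inject retrieved chunks into the template
    return generate(prompt, temperature=0.7, max_tokens=300)   # LLM call via the Inference API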

🛠️ Technical Stack

  • Framework: Gradio 4.44.0
  • Embeddings: Sentence Transformers
  • Vector Store: FAISS
  • LLMs: HuggingFace Inference API
  • GPU: HuggingFace ZeroGPU
  • PDF Processing: PyPDF2
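
For instance, extracting text from an uploaded PDF with PyPDF2 looks roughly like this (a sketch; the actual loading code lives in rag_system.py):

from PyPDF2 import PdfReader

def extract_text(pdf_path: str) -> str:
    reader = PdfReader(pdf_path)
    # Concatenate the text of every page; extract_text() may return None for image-only pages
    return "\n".join(page.extract_text() or "" for page in reader.pages)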

📁 File Structure

RAG_pedago/
├── app.py                 # Main Gradio interface
├── rag_system.py          # Core RAG logic
├── i18n.py                # Internationalization
├── requirements.txt       # Python dependencies
├── default_corpus.pdf     # Default corpus about RAG
├── default_corpus.txt     # Source text for default corpus
└── README.md              # This file

🎯 Educational Goals

This application helps students understand:

  1. Document Processing: How text is split into chunks
  2. Embeddings: How text is converted to vectors
  3. Similarity Search: How relevant information is retrieved
  4. Prompt Engineering: How context is provided to LLMs
  5. Generation: How LLMs produce answers based on retrieved context
  6. Parameter Impact: How different settings affect results

🔧 Configuration for HuggingFace Spaces

Create a README.md in your Space with this header:

---
title: RAG Pedagogical Demo
emoji: 🎓
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---

🀝 Contributing

Contributions are welcome! Feel free to:

  • Add more embedding models
  • Include additional LLMs
  • Improve the interface
  • Add more visualizations
  • Enhance documentation

📄 License

MIT License - Feel free to use this for educational purposes.

🙏 Acknowledgments

  • HuggingFace for the Spaces platform and ZeroGPU
  • Sentence Transformers for embeddings
  • FAISS for efficient similarity search
  • Gradio for the interface framework

📧 Contact

For questions or feedback, please open an issue on GitHub.