# 🎓 RAG Pedagogical Demo
A pedagogical web application that demonstrates Retrieval-Augmented Generation (RAG) systems for students and learners.
## 🌟 Features
- **Bilingual Interface** (English/French)
- **Document Processing**: Upload PDF documents or use default corpus
- **Configurable Retrieval**:
  - Choose embedding models
  - Adjust chunk size and overlap
  - Set top-k and similarity thresholds
- **Configurable Generation**:
  - Select different LLMs
  - Adjust temperature and max tokens
- **Educational Visualization**:
  - View retrieved chunks with similarity scores
  - See the exact prompt sent to the LLM
  - Understand each step of the RAG pipeline
## 🚀 Quick Start
### Local Installation
```bash
# Clone the repository
git clone <your-repo-url>
cd RAG_pedago
# Install dependencies
pip install -r requirements.txt
# Run the application
python app.py
```
### HuggingFace Spaces
This application is designed to run on HuggingFace Spaces with ZeroGPU support.
1. Create a new Space on HuggingFace
2. Select "Gradio" as the SDK
3. Enable ZeroGPU in Space settings
4. Upload all files from this repository
5. The app will automatically deploy
## 📚 Usage
### 1. Corpus Management
- Upload your own PDF document or use the included default corpus about RAG
- Configure chunk size (100-1000 characters) and overlap (0-200 characters)
- Process the corpus to create embeddings
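To make the chunk size and overlap settings concrete, here is a minimal sketch of character-based chunking with a sliding window. The function name `chunk_text` and its defaults are illustrative assumptions, not the code in `rag_system.py`, which may split text differently (for example on sentence boundaries).

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows of `chunk_size` that share `overlap` characters.

    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(repr(chunk))
```

A larger overlap repeats more text across neighbouring chunks, which helps when an answer straddles a chunk boundary but increases the number of chunks to embed.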
### 2. Retrieval Configuration
- Choose an embedding model:
  - `all-MiniLM-L6-v2`: Fast, lightweight
  - `all-mpnet-base-v2`: Better quality, slower
  - `paraphrase-multilingual-MiniLM-L12-v2`: Multilingual support
- Set top-k (1-10): Number of chunks to retrieve
- Set similarity threshold (0.0-1.0): Minimum similarity score
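The interaction between top-k and the similarity threshold can be sketched as follows. This example assumes cosine similarity over plain Python lists and uses hypothetical names (`cosine_similarity`, `retrieve`); the application itself searches a FAISS index, whose metric and scoring may differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, chunk_vecs, top_k=3, threshold=0.0):
    """Return up to `top_k` (chunk_index, score) pairs with score >= threshold, best first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    kept = [(i, s) for i, s in scored if s >= threshold]  # threshold filter
    kept.sort(key=lambda pair: pair[1], reverse=True)     # highest similarity first
    return kept[:top_k]                                   # top-k cut


if __name__ == "__main__":
    chunks = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy 2-D "embeddings"
    for index, score in retrieve([1.0, 0.0], chunks, top_k=2, threshold=0.5):
        print(index, round(score, 3))
```

Note that a high threshold can return fewer than top-k chunks, or none at all, which is one of the effects the demo lets students observe.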
### 3. Generation Configuration
- Select a language model:
  - `zephyr-7b-beta`: Fast, good quality
  - `Mistral-7B-Instruct-v0.2`: High quality
  - `Llama-2-7b-chat-hf`: Alternative option
- Adjust temperature (0.0-2.0): Controls sampling randomness; higher values give more varied output
- Set max tokens (50-1000): Upper bound on response length
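As a rough illustration of what temperature does, the sketch below rescales a list of token scores (logits) before applying softmax. The function name is hypothetical and the logits are made up; real inference backends apply the same idea inside their sampling loop, and many treat a temperature of 0 as greedy decoding rather than dividing by zero.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Convert logits to probabilities after dividing them by `temperature`."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)                               # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
    for t in (0.5, 1.0, 2.0):
        probs = softmax_with_temperature(logits, t)
        print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures flatten the distribution, so sampled answers become less predictable.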
### 4. Query & Results
- Enter your question
- Use example questions to get started
- View the generated answer
- Examine retrieved chunks with similarity scores
- Inspect the prompt sent to the LLM
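The prompt shown in the interface combines the retrieved chunks with the user's question. A minimal sketch of that assembly step might look like the following; the template wording and the `build_prompt` name are assumptions for illustration, not the exact prompt used by the application.

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a RAG prompt: instructions, numbered context chunks, then the question."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    print(build_prompt(
        "What does RAG stand for?",
        ["RAG stands for Retrieval-Augmented Generation.", "FAISS stores vectors."],
    ))
```

Numbering the chunks makes it easier to relate parts of the generated answer back to the retrieved passages when inspecting the results.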
## 🏗️ Architecture
```
┌─────────────────┐
│  PDF Document   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Text Chunking  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Embeddings    │◄──── Embedding Model
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   FAISS Index   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   User Query    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│    Retrieval    │──► Top-K Chunks
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Generation    │◄──── Language Model
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│     Answer      │
└─────────────────┘
```
## 🛠️ Technical Stack
- **Framework**: Gradio 4.44.0
- **Embeddings**: Sentence Transformers
- **Vector Store**: FAISS
- **LLMs**: HuggingFace Inference API
- **GPU**: HuggingFace ZeroGPU
- **PDF Processing**: PyPDF2
## 📁 File Structure
```
RAG_pedago/
├── app.py              # Main Gradio interface
├── rag_system.py       # Core RAG logic
├── i18n.py             # Internationalization
├── requirements.txt    # Python dependencies
├── default_corpus.pdf  # Default corpus about RAG
├── default_corpus.txt  # Source text for default corpus
└── README.md           # This file
```
## 🎯 Educational Goals
This application helps students understand:
1. **Document Processing**: How text is split into chunks
2. **Embeddings**: How text is converted to vectors
3. **Similarity Search**: How relevant information is retrieved
4. **Prompt Engineering**: How context is provided to LLMs
5. **Generation**: How LLMs produce answers based on retrieved context
6. **Parameter Impact**: How different settings affect results
## 🔧 Configuration for HuggingFace Spaces
Create a `README.md` in your Space with this header:
```yaml
---
title: RAG Pedagogical Demo
emoji: 🎓
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---
```
## 🀝 Contributing
Contributions are welcome! Feel free to:
- Add more embedding models
- Include additional LLMs
- Improve the interface
- Add more visualizations
- Enhance documentation
## 📄 License
MIT License - Feel free to use this for educational purposes.
## 🙏 Acknowledgments
- HuggingFace for the Spaces platform and ZeroGPU
- Sentence Transformers for embeddings
- FAISS for efficient similarity search
- Gradio for the interface framework
## 📧 Contact
For questions or feedback, please open an issue on GitHub.