| # GraphRAG Indexer Application | |
| ## Table of Contents | |
| 1. [Introduction](#introduction) | |
| 2. [Setup](#setup) | |
| 3. [Application Structure](#application-structure) | |
| 4. [Indexing](#indexing) | |
| 5. [Prompt Tuning](#prompt-tuning) | |
| 6. [Data Management](#data-management) | |
| 7. [Configuration](#configuration) | |
| 8. [API Integration](#api-integration) | |
| 9. [Troubleshooting](#troubleshooting) | |
| ## Introduction | |
| The GraphRAG Indexer Application is a Gradio-based user interface for managing indexing and prompt tuning in the GraphRAG (Graph Retrieval-Augmented Generation) system. It lets you configure, run, and monitor indexing and prompt tuning tasks, and manage the related data files. | |
| ## Setup | |
| 1. Ensure you have Python 3.7+ installed. | |
| 2. Install required dependencies: | |
| ``` | |
| pip install gradio requests pydantic python-dotenv pyyaml pandas lancedb | |
| ``` | |
| 3. Set up environment variables in `indexing/.env`: | |
| ``` | |
| API_BASE_URL=http://localhost:8012 | |
| LLM_API_BASE=http://localhost:11434 | |
| EMBEDDINGS_API_BASE=http://localhost:11434 | |
| ROOT_DIR=indexing | |
| ``` | |
| 4. Run the application: | |
| ``` | |
| python index_app.py | |
| ``` | |
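The variables from step 3 could be read at startup roughly as follows. This is a minimal sketch, not the application's actual code: `load_settings` and `DEFAULTS` are hypothetical names, and the fallback values simply mirror the example values above. It reads the process environment using only the standard library; the application installs `python-dotenv`, which would normally populate the environment from `indexing/.env` first.

```python
import os
from typing import Mapping, Optional

# Fallbacks mirror the example values from the setup steps. Assumption: the
# real application may use different defaults, or none at all.
DEFAULTS = {
    "API_BASE_URL": "http://localhost:8012",
    "LLM_API_BASE": "http://localhost:11434",
    "EMBEDDINGS_API_BASE": "http://localhost:11434",
    "ROOT_DIR": "indexing",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the four settings, preferring values found in `env`."""
    source = os.environ if env is None else env
    settings = {}
    for key, default in DEFAULTS.items():
        value = source.get(key, default).strip()
        # Drop a trailing slash from URLs so endpoint paths can be appended.
        settings[key] = value if key == "ROOT_DIR" else value.rstrip("/")
    return settings


if __name__ == "__main__":
    for name, value in load_settings().items():
        print(f"{name}={value}")
```

Passing a plain dict instead of relying on `os.environ` keeps the function easy to exercise without touching the real environment.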
| ## Application Structure | |
| The application is divided into three main tabs: | |
| 1. Indexing | |
| 2. Prompt Tuning | |
| 3. Data Management | |
| Each tab is described in its own section below. | |
| ## Indexing | |
| The Indexing tab allows users to configure and run the GraphRAG indexing process. | |
| ### Features: | |
| - Select LLM and Embedding models | |
| - Set root directory for indexing | |
| - Configure verbose and cache options | |
| - Advanced options for resuming, reporting, and output formats | |
| - Run indexing and check status | |
| ### Usage: | |
| 1. Select the desired LLM and Embedding models from the dropdowns. | |
| 2. Set the root directory for indexing. | |
| 3. Configure additional options as needed. | |
| 4. Click "Run Indexing" to start the process. | |
| 5. Use "Check Indexing Status" to monitor progress. | |
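Step 5 is a manual button click in the UI. A scripted equivalent would poll the status until the run finishes. The sketch below shows one way to do that; `poll_until_done` is a hypothetical helper, and the status strings are placeholders because the status response format is not documented here. The status lookup is passed in as a callable, so the loop makes no assumption about how the status is fetched.

```python
import time
from typing import Callable

# Hypothetical status values: the real status endpoint's response format is
# not documented in this README, so these strings are placeholders.
TERMINAL_STATES = {"completed", "failed"}


def poll_until_done(
    fetch_status: Callable[[], str],
    interval_s: float = 5.0,
    max_checks: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `fetch_status` repeatedly until it reports a terminal state.

    Returns the terminal status; raises TimeoutError if `max_checks`
    polls pass without reaching one.
    """
    for attempt in range(max_checks):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        # No need to wait after the final check.
        if attempt < max_checks - 1:
            sleep(interval_s)
    raise TimeoutError(f"no terminal status after {max_checks} checks")
```

The same loop would apply to prompt tuning by passing a different `fetch_status` callable.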
| ## Prompt Tuning | |
| The Prompt Tuning tab enables users to configure and run prompt tuning for GraphRAG. | |
| ### Features: | |
| - Set root directory and domain | |
| - Choose tuning method (random, top, all) | |
| - Configure limit, language, max tokens, and chunk size | |
| - Option to exclude entity types | |
| - Run prompt tuning and check status | |
| ### Usage: | |
| 1. Set the root directory and optional domain. | |
| 2. Choose the tuning method and configure parameters. | |
| 3. Click "Run Prompt Tuning" to start the process. | |
| 4. Use "Check Prompt Tuning Status" to monitor progress. | |
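The fields listed above can be checked before a run is submitted. The following is a sketch under stated assumptions rather than the application's own model: `PromptTuneOptions` and its field names are hypothetical, the numeric and language defaults are placeholders, and a standard-library dataclass stands in for whatever validation the application actually uses. Only the three method names come from the feature list.

```python
from dataclasses import dataclass
from typing import List, Optional

# The three tuning methods named in the feature list.
VALID_METHODS = ("random", "top", "all")


@dataclass
class PromptTuneOptions:
    """Hypothetical container for the Prompt Tuning tab's fields."""

    root: str
    domain: Optional[str] = None   # optional, per the usage steps
    method: str = "random"
    limit: int = 15                # placeholder default, not from the source
    language: str = "English"      # placeholder default
    max_tokens: int = 2000         # placeholder default
    chunk_size: int = 200          # placeholder default
    no_entity_types: bool = False  # the "exclude entity types" option

    def validate(self) -> List[str]:
        """Return a list of human-readable problems (empty if none)."""
        problems = []
        if not self.root.strip():
            problems.append("root directory is required")
        if self.method not in VALID_METHODS:
            problems.append(f"method must be one of {VALID_METHODS}")
        for name in ("limit", "max_tokens", "chunk_size"):
            if getattr(self, name) <= 0:
                problems.append(f"{name} must be positive")
        return problems
```

Returning a list of problems instead of raising on the first one lets a UI show every invalid field at once.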
| ## Data Management | |
| The Data Management tab provides tools for managing input files and viewing output folders. | |
| ### Features: | |
| - File upload functionality | |
| - File list management (view, refresh, delete) | |
| - Output folder exploration | |
| - File content viewing and editing | |
| ### Usage: | |
| 1. Use the File Upload section to add new input files. | |
| 2. Manage existing files in the File Management section. | |
| 3. Explore output folders and their contents in the Output Folders section. | |
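The list and delete operations behind the File Management section might look like the sketch below. The helper names are hypothetical, and the `input` subfolder under the root directory is an assumption about the layout, so adjust it to match your installation. The delete helper refuses any path that resolves outside the input folder.

```python
from pathlib import Path
from typing import List


def list_input_files(root_dir: str, subdir: str = "input") -> List[str]:
    """Return sorted relative paths of regular files under root_dir/subdir.

    The `input` subfolder name is an assumption, not taken from the README.
    """
    base = Path(root_dir) / subdir
    if not base.is_dir():
        return []
    return sorted(
        p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()
    )


def delete_input_file(root_dir: str, relative_path: str, subdir: str = "input") -> bool:
    """Delete one file; return False if it is missing or outside the folder."""
    base = (Path(root_dir) / subdir).resolve()
    target = (base / relative_path).resolve()
    # Reject paths such as "../secrets.txt" that escape the input folder.
    if base not in target.parents or not target.is_file():
        return False
    target.unlink()
    return True
```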
| ## Configuration | |
| The application uses a combination of environment variables and a `config.yaml` file for configuration. Key settings include: | |
| - LLM and Embedding models | |
| - API endpoints | |
| - Community level for GraphRAG | |
| - Token limits | |
| - API keys and types | |
| To modify these settings, edit the `.env` file (`indexing/.env`, see Setup) or create a `config.yaml` file in the root directory. | |
| ## API Integration | |
| The application integrates with a backend API for executing indexing and prompt tuning tasks. Key API endpoints used: | |
| - `/v1/index`: Start indexing process | |
| - `/v1/index_status`: Check indexing status | |
| - `/v1/prompt_tune`: Start prompt tuning process | |
| - `/v1/prompt_tune_status`: Check prompt tuning status | |
| These endpoints are called using the `requests` library, with appropriate error handling and logging. | |
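To make the request flow concrete, here is a dependency-free sketch of calling these endpoints. It uses `urllib` from the standard library instead of `requests`, so it is not the application's own client code. The helper names, the JSON payload shape, and the choice of POST for start calls and GET for status calls are all assumptions; only the four paths come from the list above.

```python
import json
import urllib.request

# Endpoint paths as listed above; the dictionary keys are hypothetical labels.
ENDPOINTS = {
    "index": "/v1/index",
    "index_status": "/v1/index_status",
    "prompt_tune": "/v1/prompt_tune",
    "prompt_tune_status": "/v1/prompt_tune_status",
}


def endpoint_url(base_url: str, name: str) -> str:
    """Join the API base URL with one of the known endpoint paths."""
    return base_url.rstrip("/") + ENDPOINTS[name]


def build_request(base_url: str, name: str, payload=None) -> urllib.request.Request:
    """Build a JSON POST when a payload is given, otherwise a GET.

    Assumption: the HTTP methods the real backend expects are not documented.
    """
    url = endpoint_url(base_url, name)
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def call_api(base_url: str, name: str, payload=None, timeout: float = 30.0):
    """Send the request and decode a JSON response; raises on HTTP errors."""
    request = build_request(base_url, name, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating `build_request` from `call_api` makes it possible to inspect the URL and body without a running backend.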
| ## Troubleshooting | |
| Common issues and solutions: | |
| 1. **Model loading fails**: Ensure `LLM_API_BASE` is set correctly and the API is accessible. | |
| 2. **Indexing or Prompt Tuning doesn't start**: Check API connectivity and verify that all required fields are filled. | |
| 3. **File management issues**: Ensure the application has read/write permissions in `ROOT_DIR`. | |
| For any persistent issues, check the application logs (visible in the console) for detailed error messages. |
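For the connectivity problems in items 1 and 2, a quick check that something is listening at a configured base URL can narrow things down. The sketch below is a hypothetical helper, not part of the application. It only tests whether a TCP connection can be opened, so a `True` result does not prove that the API itself is healthy.

```python
import socket
from urllib.parse import urlparse


def is_reachable(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    try:
        parsed = urlparse(base_url)
        # Fall back to the scheme's conventional port when none is given.
        port = parsed.port or (443 if parsed.scheme == "https" else 80)
    except ValueError:
        # Raised for malformed URLs, e.g. a non-numeric port.
        return False
    if not parsed.hostname:
        return False
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Example values from the setup section.
    for url in ("http://localhost:8012", "http://localhost:11434"):
        print(url, "reachable" if is_reachable(url) else "NOT reachable")
```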