Spaces:
Runtime error
Runtime error
# GraphRAG Indexer Application | |
## Table of Contents | |
1. [Introduction](#introduction) | |
2. [Setup](#setup) | |
3. [Application Structure](#application-structure) | |
4. [Indexing](#indexing) | |
5. [Prompt Tuning](#prompt-tuning) | |
6. [Data Management](#data-management) | |
7. [Configuration](#configuration) | |
8. [API Integration](#api-integration) | |
9. [Troubleshooting](#troubleshooting) | |
## Introduction | |
The GraphRAG Indexer Application is a Gradio-based user interface for managing the indexing and prompt tuning processes of the GraphRAG (Graph Retrieval-Augmented Generation) system. This application provides an intuitive way to configure, run, and monitor indexing and prompt tuning tasks, as well as manage related data files. | |
## Setup | |
1. Ensure you have Python 3.7+ installed. | |
2. Install required dependencies: | |
``` | |
pip install gradio requests pydantic python-dotenv pyyaml pandas lancedb | |
``` | |
3. Set up environment variables in `indexing/.env`: | |
``` | |
API_BASE_URL=http://localhost:8012 | |
LLM_API_BASE=http://localhost:11434 | |
EMBEDDINGS_API_BASE=http://localhost:11434 | |
ROOT_DIR=indexing | |
``` | |
4. Run the application: | |
``` | |
python index_app.py | |
``` | |
## Application Structure | |
The application is divided into three main tabs: | |
1. Indexing | |
2. Prompt Tuning | |
3. Data Management | |
Each tab provides specific functionality related to its purpose. | |
## Indexing | |
The Indexing tab allows users to configure and run the GraphRAG indexing process. | |
### Features: | |
- Select LLM and Embedding models | |
- Set root directory for indexing | |
- Configure verbose and cache options | |
- Advanced options for resuming, reporting, and output formats | |
- Run indexing and check status | |
### Usage: | |
1. Select the desired LLM and Embedding models from the dropdowns. | |
2. Set the root directory for indexing. | |
3. Configure additional options as needed. | |
4. Click "Run Indexing" to start the process. | |
5. Use "Check Indexing Status" to monitor progress. | |
## Prompt Tuning | |
The Prompt Tuning tab enables users to configure and run prompt tuning for GraphRAG. | |
### Features: | |
- Set root directory and domain | |
- Choose tuning method (random, top, all) | |
- Configure limit, language, max tokens, and chunk size | |
- Option to exclude entity types | |
- Run prompt tuning and check status | |
### Usage: | |
1. Set the root directory and optional domain. | |
2. Choose the tuning method and configure parameters. | |
3. Click "Run Prompt Tuning" to start the process. | |
4. Use "Check Prompt Tuning Status" to monitor progress. | |
## Data Management | |
The Data Management tab provides tools for managing input files and viewing output folders. | |
### Features: | |
- File upload functionality | |
- File list management (view, refresh, delete) | |
- Output folder exploration | |
- File content viewing and editing | |
### Usage: | |
1. Use the File Upload section to add new input files. | |
2. Manage existing files in the File Management section. | |
3. Explore output folders and their contents in the Output Folders section. | |
## Configuration | |
The application uses a combination of environment variables and a `config.yaml` file for configuration. Key settings include: | |
- LLM and Embedding models | |
- API endpoints | |
- Community level for GraphRAG | |
- Token limits | |
- API keys and types | |
To modify these settings, edit the `.env` file or create a `config.yaml` file in the root directory. | |
## API Integration | |
The application integrates with a backend API for executing indexing and prompt tuning tasks. Key API endpoints used: | |
- `/v1/index`: Start indexing process | |
- `/v1/index_status`: Check indexing status | |
- `/v1/prompt_tune`: Start prompt tuning process | |
- `/v1/prompt_tune_status`: Check prompt tuning status | |
These endpoints are called using the `requests` library, with appropriate error handling and logging. | |
## Troubleshooting | |
Common issues and solutions: | |
1. **Model loading fails**: Ensure the LLM_API_BASE is correctly set and the API is accessible. | |
2. **Indexing or Prompt Tuning doesn't start**: Check API connectivity and verify that all required fields are filled. | |
3. **File management issues**: Ensure proper read/write permissions in the ROOT_DIR. | |
For any persistent issues, check the application logs (visible in the console) for detailed error messages. |