Yago Bolivar committed
Commit 87aad23 · 1 Parent(s): 3a78b26

feat: add Getting Started, Local Testing, and Next Steps guides for GAIA Agent development

Files changed (3)
  1. GETTING_STARTED.md +65 -0
  2. LOCAL_TESTING.md +77 -0
  3. NEXT_STEPS.md +44 -0
GETTING_STARTED.md ADDED
@@ -0,0 +1,65 @@
+ # Getting Started with GAIA Agent Development
+
+ This guide walks you through developing the GAIA Agent in your existing virtual environment.
+
+ ## Prerequisites
+
+ - Python 3.8+
+ - Virtual environment (already in `.venv`)
+ - Hugging Face account (for deployment)
+
+ ## Setup and Installation
+
+ 1. **Activate your existing virtual environment**:
+ ```bash
+ source .venv/bin/activate
+ ```
+
+ 2. **Install the required dependencies**:
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 3. **Install additional packages for the agent**:
+ ```bash
+ pip install gpt4all beautifulsoup4 pandas pillow python-dotenv searchapi
+ ```
+
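The setup steps above can be sanity-checked with a short preflight script. This is a minimal sketch, not part of the commit: the function names are hypothetical, and `searchapi` is left out because the guide does not state its import name. The import names for `beautifulsoup4`, `pillow`, and `python-dotenv` (`bs4`, `PIL`, `dotenv`) differ from their install names.

```python
import importlib.util
import sys

# Distribution name -> import name. Several packages are imported under a
# different name than the one passed to pip.
REQUIRED_MODULES = {
    "gpt4all": "gpt4all",
    "beautifulsoup4": "bs4",
    "pandas": "pandas",
    "pillow": "PIL",
    "python-dotenv": "dotenv",
}


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def missing_packages(modules=REQUIRED_MODULES):
    """Return the distribution names whose import name cannot be found."""
    return [
        dist for dist, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


def preflight() -> list:
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
    if not in_virtualenv():
        problems.append("no virtual environment is active (try: source .venv/bin/activate)")
    missing = missing_packages()
    if missing:
        problems.append("missing packages: " + ", ".join(missing))
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Running it inside the activated `.venv` should print nothing once all three setup steps have succeeded.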
+ ## Development Workflow
+
+ 1. **Local Testing**:
+ ```bash
+ python app_local.py
+ ```
+ This runs a local version of the agent with a limited question set for testing.
+
+ 2. **Running the full agent**:
+ ```bash
+ python app2.py
+ ```
+ Note: This requires Hugging Face authentication when running locally.
+
+ 3. **Evaluating the agent**:
+ ```bash
+ python utilities/evaluate_local.py
+ ```
+ This evaluates your agent against the common questions dataset.
+
+ ## Project Structure
+
+ - `app2.py` - The main GAIA agent implementation
+ - `app_local.py` - Modified version for local testing without requiring login
+ - `devplan.md` - Development plan and architecture design
+ - `question_set/` - Contains question datasets for testing
+ - `utilities/` - Helper scripts for evaluating and testing
+ - `docs/` - Documentation about the API and submission process
+
+ ## Next Steps
+
+ See the `NEXT_STEPS.md` file for a checklist of planned improvements.
+
+ ## Troubleshooting
+
+ - **Authentication Issues**: For local testing, use `app_local.py`, which doesn't require a Hugging Face login
+ - **Missing Dependencies**: Install all requirements with `pip install -r requirements.txt`
+ - **File Not Found Errors**: Create a `dataset` directory for downloaded files
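The `dataset` directory from the last troubleshooting item can also be created from Python at startup. A minimal sketch, assuming the directory belongs in the project root (the guide does not say where); `ensure_dataset_dir` is a hypothetical helper name.

```python
from pathlib import Path


def ensure_dataset_dir(base=".") -> Path:
    """Create <base>/dataset if it is missing and return its path."""
    dataset_dir = Path(base) / "dataset"
    # exist_ok=True makes the call safe to repeat on every run.
    dataset_dir.mkdir(parents=True, exist_ok=True)
    return dataset_dir
```

Calling `ensure_dataset_dir()` before any download code runs avoids the missing-directory error without a manual `mkdir`.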
LOCAL_TESTING.md ADDED
@@ -0,0 +1,77 @@
+ # Local Testing Guide for GAIA Agent
+
+ This document outlines how to test the GAIA agent locally during development.
+
+ ## Setup
+
+ 1. Install dependencies:
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 2. If you want to use the OAuth features locally:
+ ```bash
+ huggingface-cli login
+ ```
+ Or set the `HF_TOKEN` environment variable with your token from [HF Settings](https://huggingface.co/settings/tokens).
+
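To confirm from Python that the `HF_TOKEN` variable is visible to the process, something like the following could be used. It is a sketch under assumptions: both helper names are hypothetical, and it only inspects the environment variable, so a token stored by `huggingface-cli login` is not detected here.

```python
import os


def get_hf_token(env=None):
    """Return the Hugging Face token from HF_TOKEN, or None if unset or blank."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def describe_auth(env=None) -> str:
    """Summarise the auth state without ever printing the token itself."""
    if get_hf_token(env):
        return "HF_TOKEN is set"
    return "HF_TOKEN is not set; run `huggingface-cli login` or use app_local.py"


if __name__ == "__main__":
    print(describe_auth())
```

Passing a plain dictionary as `env` keeps the helpers easy to test without touching the real environment.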
+ ## Running the Application
+
+ ### Option 1: Simplified Local Testing (Recommended for Development)
+
+ Use `app_local.py`, which has a mock agent and doesn't require OAuth:
+
+ ```bash
+ python app_local.py
+ ```
+
+ Or use the helper script:
+
+ ```bash
+ bash run_local.sh
+ ```
+
+ The helper script will:
+ - Install required dependencies
+ - Run the local version of the app
+ - Use a mock agent that returns test responses
+ - Use local sample questions without making API calls
+ - Not submit any answers to the actual API
+
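The mock-agent idea described above can be sketched as follows. This is not the code in `app_local.py`: the class name, the sample questions, and the `task_id`, `question`, and `submitted_answer` keys are all assumptions made for illustration.

```python
class MockAgent:
    """Stand-in agent: returns a canned response instead of calling a model."""

    def __init__(self, canned_answer="This is a test response."):
        self.canned_answer = canned_answer

    def __call__(self, question: str) -> str:
        return self.canned_answer


def run_locally(agent, questions):
    """Answer each question in memory; nothing is sent to any API."""
    results = []
    for item in questions:
        results.append({
            "task_id": item["task_id"],
            "submitted_answer": agent(item["question"]),
        })
    return results


# Hypothetical sample questions; the real ones live in the project's question set.
SAMPLE_QUESTIONS = [
    {"task_id": "sample-1", "question": "What is 2 + 2?"},
    {"task_id": "sample-2", "question": "Name the capital of France."},
]

if __name__ == "__main__":
    for row in run_locally(MockAgent(), SAMPLE_QUESTIONS):
        print(row)
```

Because the agent is just a callable that maps a question string to an answer string, a real agent with the same call signature could be swapped in without changing the loop.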
+ ### Option 2: Full Application with Test Username
+
+ To test the full application without logging in:
+
+ ```bash
+ python app2.py
+ ```
+
+ When the application loads:
+ 1. Enter a test username in the "Or enter test username for local development" field
+ 2. Click "Run Evaluation & Submit All Answers"
+
+ ### Option 3: Full Application with OAuth
+
+ To test the complete application with OAuth authentication:
+
+ 1. Make sure you're logged in to the Hugging Face CLI: `huggingface-cli login`
+ 2. Run: `python app.py` or `python app2.py`
+ 3. Click the "Login" button in the interface
+ 4. After logging in, click "Run Evaluation & Submit All Answers"
+
+ ## Debugging
+
+ If you encounter OAuth-related errors:
+ 1. Check if you're logged in with `huggingface-cli whoami`
+ 2. Try setting your Hugging Face token as an environment variable:
+ ```bash
+ export HF_TOKEN=your_token_here
+ ```
+ 3. Use the local testing version (`app_local.py`), which avoids OAuth entirely
+
+ ## Next Steps
+
+ 1. Replace the mock agent in `app_local.py` with your real agent implementation
+ 2. Test with a small set of sample questions before scaling up
+ 3. Gradually add and test tools (web search, file reader, etc.)
+ 4. When ready, deploy to Hugging Face Spaces for full evaluation
NEXT_STEPS.md ADDED
@@ -0,0 +1,44 @@
+ # Next Steps for GAIA Agent Development
+
+ ## Current Status
+ - ✅ Created basic agent structure (`app2.py`)
+ - ✅ Set up local testing environment (`app_local.py`)
+ - ✅ Fixed question format handling
+ - ✅ Tested local environment functionality
+
+ ## High Priority Tasks
+
+ ### 1. LLM Integration
+ - [ ] Add GPT4All with Llama 3 integration
+ - [ ] Update system prompts for proper GAIA answer formatting
+ - [ ] Implement proper reasoning and answer extraction
+
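For the answer-extraction item above, one possible approach is to have the system prompt ask the model to finish with a marker line and then parse that line. The `FINAL ANSWER:` marker and the function name below are assumptions for this sketch, not something the commit defines.

```python
import re

# Assumed convention: the system prompt asks the model to end its reply with
# a line of the form "FINAL ANSWER: <answer>".
FINAL_ANSWER_RE = re.compile(r"FINAL ANSWER:\s*(.+)", re.IGNORECASE)


def extract_final_answer(llm_output: str) -> str:
    """Return the text after the last marker, else the whole output stripped."""
    matches = FINAL_ANSWER_RE.findall(llm_output)
    if matches:
        # Use the last occurrence in case the model repeats the marker
        # while reasoning.
        return matches[-1].strip()
    return llm_output.strip()
```

Falling back to the full output when the marker is absent keeps the agent from returning an empty answer, at the cost of submitting reasoning text that is unlikely to match exactly.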
+ ### 2. Core Tool Implementation
+ - [ ] Web Search Tool (using SerpAPI, Google Custom Search API, or similar)
+ - [ ] File Reader Tool (handling different file formats)
+   - [ ] Text-based files (.txt, .py, .md)
+   - [ ] Images (.png, .jpg) with vision model
+   - [ ] Audio (.mp3) with speech-to-text
+   - [ ] Spreadsheets (.xlsx) with pandas
+ - [ ] Code Interpreter Tool (safe Python execution)
+
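The File Reader Tool in the checklist above could start as a dispatch on file extension. This is a minimal sketch using only the standard library: it reads the text formats directly and returns a placeholder message for the formats that still need a vision, speech-to-text, or pandas handler. The function name and the message format are hypothetical.

```python
from pathlib import Path

TEXT_SUFFIXES = {".txt", ".py", ".md"}
# Formats from the checklist that still need a real handler.
PENDING_SUFFIXES = {
    ".png": "vision model",
    ".jpg": "vision model",
    ".mp3": "speech-to-text",
    ".xlsx": "pandas",
}


def read_file(path) -> str:
    """Return file contents for text formats; describe what is missing otherwise."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix in TEXT_SUFFIXES:
        return path.read_text(encoding="utf-8", errors="replace")
    if suffix in PENDING_SUFFIXES:
        return f"[{suffix} not supported yet: needs {PENDING_SUFFIXES[suffix]}]"
    return f"[unsupported file type: {suffix or 'no extension'}]"
```

Returning a descriptive string instead of raising lets the agent keep going on questions with unsupported attachments while each handler is added and tested one at a time.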
+ ### 3. Question Analysis & Planning
+ - [ ] Use LLM for question classification
+ - [ ] Implement multi-step reasoning for complex questions
+ - [ ] Handle file references in questions
+
+ ### 4. Testing & Evaluation
+ - [ ] Create test cases for each question type
+ - [ ] Use `utilities/evaluate_local.py` to evaluate performance
+ - [ ] Track accuracy improvements
+
+ ## Dependencies to Add
+ - [ ] `gpt4all` for LLM
+ - [ ] `beautifulsoup4` for web scraping (if needed)
+ - [ ] `pandas` for spreadsheet handling
+ - [ ] Vision and speech-to-text libraries (TBD)
+
+ ## Notes
+ - The GPT4All model path seems to be: "/Users/yagoairm2/Library/Application Support/nomic.ai/GPT4All/Meta-Llama-3-8B-Instruct.Q4_0.gguf"
+ - Use `common_questions.json` for testing
+ - Follow GAIA evaluation criteria for exact answer matching
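Exact answer matching against `common_questions.json` could be measured along these lines. This is a sketch under stated assumptions: the file is assumed to be a JSON list of records, the `task_id` and `answer` keys are hypothetical, and the trim-and-lowercase normalisation is a stand-in that may differ from the official GAIA scorer.

```python
import json


def normalize(answer) -> str:
    """Assumed normalisation: trim and lowercase. The official scorer may differ."""
    return str(answer).strip().lower()


def accuracy(predictions: dict, references: dict) -> float:
    """Fraction of reference task ids whose prediction matches after normalisation."""
    if not references:
        return 0.0
    correct = sum(
        1 for task_id, expected in references.items()
        if normalize(predictions.get(task_id, "")) == normalize(expected)
    )
    return correct / len(references)


def load_references(path="common_questions.json") -> dict:
    """Load task_id -> expected answer. The key names here are hypothetical."""
    with open(path, encoding="utf-8") as handle:
        records = json.load(handle)
    return {record["task_id"]: record["answer"] for record in records}
```

A missing prediction counts as wrong, so the score reflects unanswered questions as well as incorrect ones.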