Spaces:
Sleeping
Sleeping
metadata
title: Data Analyst Agent
emoji: π§
colorFrom: indigo
colorTo: blue
sdk: docker
pinned: false
app_port: 7860
Data Analyst Agent
This project implements a Data Analyst Agent that can interpret natural-language questions, fetch or process data, and return structured answers. It is designed for the Tools in Data Science course (TDS Project 2). This used to be deployed on Railway.
The agent runs as a web service and provides an automated workflow that:
- Accepts a plain text query (
question.txt
) from the user. - Uses a large language model (LLM) to break down the task into subtasks.
- Dynamically generates and executes Python code for each subtask.
- Returns the final answer in a structured JSON format (plots, tables, or text, depending on the question).
π Features
- Natural Language Input: Users provide a query in plain English (e.g., "Plot the monthly trend of stock prices for Apple").
- Automated Task Breakdown: The agent decomposes queries into data sourcing, analysis, and visualization steps.
- Dynamic Code Generation: Subtasks are executed using generated Python code in a controlled environment.
- Flexible Output: Responses adapt to the query (JSON objects, base64-encoded plots, tables, or text summaries).
- Containerized Deployment: Ready to run via Docker and deployable on Heroku.
π Repository Structure
app.py
β Main FastAPI app that handles requests and responses.index.html
β Simple front-end interface for testing queries.requirements.txt
β Python dependencies.Dockerfile
β Containerization setup.Procfile
β Deployment instructions for Heroku.runtime.txt
β Python runtime version for deployment.entrypoint.sh
β Script to start the application.README.md
β Project documentation (this file).
βοΈ Installation & Setup
Prerequisites
- Python 3.9+
- Docker (optional, for containerized deployment)
- An LLM API key (e.g., OpenAI, Gemini, or other local model as configured)
Local Setup
# Clone the repository
git clone https://github.com/Aditya-Darshan-cyber/TDS-Project-2---Data-Analyst-Agent.git
cd TDS-Project-2---Data-Analyst-Agent
# Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Run the application
python app.py
The app will be available at http://127.0.0.1:8000
.
Docker Setup
# Build the Docker image
docker build -t data-analyst-agent .
# Run the container
docker run -p 8000:8000 data-analyst-agent
π Usage
Prepare a plain text file (
question.txt
) with your query, e.g.:Show me a histogram of housing prices by city.
Send the query to the API:
curl -X POST -F "file=@question.txt" http://127.0.0.1:8000/analyze
Receive a JSON response with structured results.
- If the task involves plotting, the response will include a base64-encoded image.
- If it involves summarization, the response will be text or tables.
π Example Workflow
flowchart TD
A[User uploads question.txt] --> B[FastAPI receives request]
B --> C[LLM breaks down task into subtasks]
C --> D[Python code dynamically generated]
D --> E[Code executed in sandboxed environment]
E --> F[Results formatted as JSON]
F --> G[Response returned to user]
Example
Input (question.txt
):
Get the average closing price of Apple stock in 2023 and plot the monthly trend.
Output (JSON response):
{
"average_price": 172.54,
"monthly_plot": "base64encodedimage==",
"status": "success"
}
π οΈ Tech Stack
- Backend: FastAPI
- LLM Integration: Configurable via
.env
for API keys - Deployment: Docker + Heroku
- Language: Python 3.9+
π License
This project is licensed under the MIT License.