---
title: Insights
emoji: 📈
colorFrom: gray
colorTo: yellow
sdk: streamlit
sdk_version: 1.33.0
app_file: app.py
pinned: false
---
# Insights

Insights is a Streamlit app for loading, exploring, transforming, and visualizing CSV datasets, with LLM-assisted question answering.

## Deployment
[HuggingFace](https://huggingface.co/spaces/AtharvaThakur/Insights)

## Modules

- `DataLoader`: Loads data from an uploaded CSV file or from a URL pointing to one (a sketch follows this list).
- `DataAnalyzer`: Provides summary statistics and data types for the loaded dataset.
- `DataFilter`: Filters rows based on user-defined conditions.
- `DataTransformer`: Performs column-level operations such as creating, removing, or retyping columns.
- `DataVisualizer`: Renders the data as a histogram, box plot, pie chart, scatter plot, or heatmap.
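
As a rough illustration, a loader covering both input paths could look like the sketch below. This is not the repo's actual `DataLoader`; the `load_data` method name and widget labels are assumptions:

```
from typing import Optional

import pandas as pd
import streamlit as st


class DataLoader:
    """Loads a CSV into a pandas DataFrame from an upload widget or a URL."""

    def load_data(self) -> Optional[pd.DataFrame]:
        uploaded = st.file_uploader("Upload a CSV file", type="csv")
        url = st.text_input("...or enter the URL of a CSV file")
        if uploaded is not None:
            return pd.read_csv(uploaded)
        if url:
            # pandas reads directly from HTTP(S) URLs
            return pd.read_csv(url)
        return None
```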

## Features

- Upload a CSV file or load one from a URL.
- Display the loaded dataset.
- Show summary statistics and data types.
- Filter rows based on user-defined conditions.
- Transform data: handle null values, create or remove columns, and change datatypes.
- Visualize data with various plot types (histogram, box plot, pie chart, scatter plot, heatmap); see the sketch below.
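
The plot-type menu could dispatch to Plotly Express as in this sketch; the repo may use a different plotting backend, and the `visualize` helper is an assumed name:

```
import plotly.express as px
import streamlit as st


def visualize(df, kind, x, y=None):
    # Map the user's menu choice to a Plotly Express figure
    if kind == "Histogram":
        fig = px.histogram(df, x=x)
    elif kind == "Box Plot":
        fig = px.box(df, x=x, y=y)
    elif kind == "Pie Chart":
        fig = px.pie(df, names=x)
    elif kind == "Scatter Plot":
        fig = px.scatter(df, x=x, y=y)
    elif kind == "Heatmap":
        # Correlation heatmap over the numeric columns
        fig = px.imshow(df.corr(numeric_only=True))
    else:
        raise ValueError(f"Unknown plot type: {kind}")
    st.plotly_chart(fig)
```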

## Installation

1. Install the required packages:
   The project's dependencies are listed in `requirements.txt`. Install them all with pip:
   ```
   pip install -r requirements.txt
   ```
2. Run the application:
   Start the Streamlit server with:
   ```
   streamlit run app.py
   ```

## Web app

1. Main page: Data Exploration
   - Data Loader
   - DataQA (an LLM with a Python interpreter / CSV agent)
   - Data Analyzer
   - Data Filter
   - Data Visualizer

2. Data Transformation (see the pandas sketch after this outline)
   - Handling null values
   - Creating new columns
   - Removing columns
   - Changing datatypes
   - Option to analyze the transformed dataset or save it

3. Natural language data analysis (pure LLM; see the second sketch after this outline)
   - Insights generation
   - Automating data analysis/transformation
   - Report generation
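
The transformation operations under item 2 map directly onto pandas. A minimal sketch, where the helper names are illustrative rather than the repo's API:

```
import pandas as pd


def fill_nulls(df, column, value):
    # Replace missing values in one column
    df[column] = df[column].fillna(value)
    return df


def add_column(df, name, expression):
    # Create a new column from an expression over existing columns,
    # e.g. add_column(df, "total", "price * quantity")
    df[name] = df.eval(expression)
    return df


def remove_columns(df, columns):
    return df.drop(columns=columns)


def change_dtype(df, column, dtype):
    # e.g. change_dtype(df, "price", "float64")
    df[column] = df[column].astype(dtype)
    return df
```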
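
For item 3, the Docker instructions below pass a `GOOGLE_API_KEY`, which suggests the Gemini API. A sketch of insights generation on that assumption; the model name and prompt are illustrative:

```
import os

import google.generativeai as genai
import pandas as pd

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])


def generate_insights(df: pd.DataFrame) -> str:
    # Send a compact summary of the dataset and ask for notable insights
    model = genai.GenerativeModel("gemini-1.5-flash")
    prompt = (
        "You are a data analyst. Given this dataset summary, list the "
        "most notable insights in plain language.\n\n"
        f"Columns and dtypes:\n{df.dtypes}\n\n"
        f"Summary statistics:\n{df.describe(include='all')}"
    )
    return model.generate_content(prompt).text
```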


## Running with Docker

1. Build the Docker image:
   ```
   docker build -t insights .
   ```
2. Run the container, passing your Google API key for the LLM features:
   ```
   docker run -p 8501:8501 -e GOOGLE_API_KEY=<your-api-key> insights
   ```
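
The build step assumes a Dockerfile at the repository root. It is not shown here, but a minimal one consistent with the commands above might look like:

```
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8501
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
```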