boringnose committed on
Commit 011ad11 • 1 Parent(s): ad98fbf

Update README.md

  1. README.md +81 -72
README.md CHANGED
@@ -1,72 +1,81 @@
1
- # DataFlow Pro
2
- Automating ML Workflows with Ease
3
-
4
- ## Introduction
5
- The Automated ML is a Python application designed to automate the process of building, tuning, and evaluating machine learning models based on json provided in RTF/JSON?/TXT file format. <br>
6
- This application follows a structured flow to read the json file, extract dataset information, transform features, split data, build and tune models, and evaluate their performance.
7
-
8
-
9
- ## Installation
10
- To use the Automated ML Pipeline, follow these steps:
11
-
12
- 1. Clone this repository to your local machine: <br>
13
- ```git clone https://github.com/Rupanshu-Kapoor/AutomateML.git```
14
-
15
- 2. Install the required dependencies: <br>
16
- `pip install -r requirements.txt`
17
-
18
- 3. Run the application: <br>
19
- `streamlit run app.py`
20
-
21
-
22
- ## Steps to Use the Application:
23
-
24
- You can use the application in following two ways:
25
-
26
- ### (A). Create Json and Train Model
27
-
28
- 1. Upload the dataset on the tool on which you want to train the different model.
29
- 2. Once the data is uploaded, you can preview the dataset.
30
- 3. Select prediction parameters (prediction type, target variable, k-fold, etc.).
31
- 4. Select features to be used for prediction.
32
- 5. When you select any feature, you can choose how to handle it. (rescaling, encoding, etc.)
33
- 6. Select the model to be used for prediction.
34
- 7. When you select any model, you can choose hyperparameters for tuning.
35
- 8. Once all the parameters are selected, click on `Generate Json and Train Model` button.
36
- 9. Application will generate the json file and train the model and display the results.
37
-
38
- ### (B). Upload Json and Train Model
39
- 1. Upload the json file that contains all the dataset information.
40
- 2. Click on Train Models.
41
- 3. Application will train the model and display the results.
42
-
43
- ## Working of the Application:
44
- The application performs the following tasks in sequence:
45
- 1. **Read the JSON File and Parse JSON Content**: The RTF/JSON file is read, converted to plain text, and JSON content is extracted.
46
- 2. **Extract Dataset Information**: Extract dataset information such as feature names, target variable, problem type (regression/classification), feature handling, etc.
47
- 3. **Transform Features**: Features are transformed based on the specified feature handling methods.
48
- 4. **Sample Data and Train-Test Split**: Data is sampled and split into training and testing sets.
49
- 5. **Model Building**: Models are built based on the problem type (regression/classification).
50
- 6. **Hyperparameter Tuning**: Hyperparameters of the models are tuned using grid search.
51
- 7. **Model Evaluation**: Trained models are evaluated using specified evaluation metrics.
52
- <! --8. **Save Results**: Trained models and evaluation metrics are saved in the results/ directory. -->
53
-
54
-
55
- ## Use Cases
56
-
57
- This application can be used for various use cases, including but not limited to:
58
-
59
- - Automated machine learning (AutoML) pipelines.
60
- - Data preprocessing and feature engineering tasks.
61
- - Model training and evaluation for regression or classification problems.
62
- - Hyperparameter tuning and model selection.
63
- - Experimentation with different datasets and configurations.
64
-
65
- ## Future Work
66
- Possible future enhancements for the application include:
67
-
68
- - Adding support for additional data formats (e.g., CSV, Excel).
69
- - Implementing more advanced feature engineering techniques.
70
- - Incorporating more sophisticated model selection and evaluation methods.
71
- - Enhancing the user interface for easier interaction.
72
- - Integrating with external APIs or databases for data retrieval.
1
+ ---
2
+ license: gpl-3.0
3
+ title: DataFlowPro
4
+ sdk: streamlit
5
+ emoji: 🚀
6
+ colorFrom: purple
7
+ colorTo: pink
8
+ short_description: Automating ML Workflows with Ease
9
+ ---
10
+ # DataFlow Pro
11
+ Automating ML Workflows with Ease
12
+
13
+ ## Introduction
14
+ The Automated ML is a Python application designed to automate the process of building, tuning, and evaluating machine learning models based on json provided in RTF/JSON?/TXT file format. <br>
15
+ This application follows a structured flow to read the json file, extract dataset information, transform features, split data, build and tune models, and evaluate their performance.
16
+
17
+
18
+ ## Installation
19
+ To use the Automated ML Pipeline, follow these steps:
20
+
21
+ 1. Clone this repository to your local machine: <br>
22
+ ```git clone https://github.com/Rupanshu-Kapoor/AutomateML.git```
23
+
24
+ 2. Install the required dependencies: <br>
25
+ `pip install -r requirements.txt`
26
+
27
+ 3. Run the application: <br>
28
+ `streamlit run app.py`
29
+
30
+
31
+ ## Steps to Use the Application:
32
+
33
+ You can use the application in following two ways:
34
+
35
+ ### (A). Create Json and Train Model
36
+
37
+ 1. Upload the dataset on the tool on which you want to train the different model.
38
+ 2. Once the data is uploaded, you can preview the dataset.
39
+ 3. Select prediction parameters (prediction type, target variable, k-fold, etc.).
40
+ 4. Select features to be used for prediction.
41
+ 5. When you select any feature, you can choose how to handle it. (rescaling, encoding, etc.)
42
+ 6. Select the model to be used for prediction.
43
+ 7. When you select any model, you can choose hyperparameters for tuning.
44
+ 8. Once all the parameters are selected, click on `Generate Json and Train Model` button.
45
+ 9. Application will generate the json file and train the model and display the results.
46
+
47
+ ### (B). Upload Json and Train Model
48
+ 1. Upload the json file that contains all the dataset information.
49
+ 2. Click on Train Models.
50
+ 3. Application will train the model and display the results.
51
+
52
+ ## Working of the Application:
53
+ The application performs the following tasks in sequence:
54
+ 1. **Read the JSON File and Parse JSON Content**: The RTF/JSON file is read, converted to plain text, and JSON content is extracted.
55
+ 2. **Extract Dataset Information**: Extract dataset information such as feature names, target variable, problem type (regression/classification), feature handling, etc.
56
+ 3. **Transform Features**: Features are transformed based on the specified feature handling methods.
57
+ 4. **Sample Data and Train-Test Split**: Data is sampled and split into training and testing sets.
58
+ 5. **Model Building**: Models are built based on the problem type (regression/classification).
59
+ 6. **Hyperparameter Tuning**: Hyperparameters of the models are tuned using grid search.
60
+ 7. **Model Evaluation**: Trained models are evaluated using specified evaluation metrics.
61
+ <! --8. **Save Results**: Trained models and evaluation metrics are saved in the results/ directory. -->
62
+
63
+
64
+ ## Use Cases
65
+
66
+ This application can be used for various use cases, including but not limited to:
67
+
68
+ - Automated machine learning (AutoML) pipelines.
69
+ - Data preprocessing and feature engineering tasks.
70
+ - Model training and evaluation for regression or classification problems.
71
+ - Hyperparameter tuning and model selection.
72
+ - Experimentation with different datasets and configurations.
73
+
74
+ ## Future Work
75
+ Possible future enhancements for the application include:
76
+
77
+ - Adding support for additional data formats (e.g., CSV, Excel).
78
+ - Implementing more advanced feature engineering techniques.
79
+ - Incorporating more sophisticated model selection and evaluation methods.
80
+ - Enhancing the user interface for easier interaction.
81
+ - Integrating with external APIs or databases for data retrieval.
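
The README in this diff says the application reads a JSON config from a file that may contain surrounding text, and later tunes hyperparameters with grid search. Neither step is shown in the commit, so the following is only a minimal sketch of the idea. It assumes the file has already been converted to plain text. The helper names `extract_json` and `expand_grid` and the config keys (`target`, `models`) are hypothetical and are not taken from the repository.

```python
import itertools
import json


def extract_json(text):
    """Return the first JSON object embedded in `text`, skipping surrounding noise."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # Not valid JSON at this brace; try the next one.
        start = text.find("{", start + 1)
    raise ValueError("no JSON object found in text")


def expand_grid(param_grid):
    """List every hyperparameter combination that a grid search would try."""
    names = sorted(param_grid)
    value_lists = [param_grid[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


# Hypothetical config text: the key names are illustrative, not the app's schema.
raw_text = (
    "exported by editor\n"
    '{"target": "price", "models": {"RandomForest": '
    '{"n_estimators": [50, 100], "max_depth": [3, 5]}}}\n'
    "trailing text"
)

config = extract_json(raw_text)
for model_name, grid in config["models"].items():
    candidates = expand_grid(grid)
    print(model_name, len(candidates), "candidate settings")
    for params in candidates:
        print("  ", params)
```

In a real pipeline each candidate setting would be fitted and scored with cross-validation; a library grid-search utility would normally handle that loop, and whether the application uses one is not stated in the README.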