Sathwikchowdary committed on
Commit
ad9bdc8
·
verified ·
1 Parent(s): c69351c

Update pages/3Ensemble_Techniques.py

Files changed (1)
  1. pages/3Ensemble_Techniques.py +159 -0
pages/3Ensemble_Techniques.py CHANGED
@@ -0,0 +1,159 @@
import streamlit as st

# Page configuration
st.set_page_config(page_title="Ensemble Techniques", page_icon="🤖", layout="wide")

# Custom styling
st.markdown("""
<style>
.stApp {
    background-color: #f2f6fa;
}
h1, h2, h3 {
    color: #1a237e;
}
.custom-font, p, li {
    font-family: 'Arial', sans-serif;
    font-size: 18px;
    color: #212121;
    line-height: 1.6;
}
</style>
""", unsafe_allow_html=True)

# Title
st.markdown("<h1>Ensemble Learning Techniques</h1>", unsafe_allow_html=True)

# Introduction
st.markdown("""
Ensemble learning is a strategy in machine learning where **multiple models** (called base models) are combined to produce a more accurate and robust **ensemble model**. The core idea is that a group of diverse models often performs better than any individual model alone.
""", unsafe_allow_html=True)

st.markdown("**Assumption:** The base models should be **diverse**. If they are too similar, the overall ensemble may lose its advantage and yield poor results.")

# Types of Ensemble
st.markdown("<h2>Types of Ensemble Techniques</h2>", unsafe_allow_html=True)
st.write("Ensemble techniques vary based on how base models are built and how their outputs are combined.")
st.image("diff_ensemble_tecniques.png", width=900)

# Voting Ensemble
st.markdown("<h2>1. Voting Ensemble</h2>", unsafe_allow_html=True)
st.write("Voting is a straightforward ensemble approach suitable for both classification and regression. It aggregates the predictions from multiple models to make the final prediction.")

st.write("**Types:**")
st.write("- **Hard Voting**: Final output is the most frequent class label among base models.")
st.write("- **Soft Voting**: Uses the average of class probabilities to decide the output.")

st.markdown("**Steps for Classification:**")
st.markdown("""
1. Select different base models.
2. Train each on the same dataset.
3. Gather predictions.
4. Use hard or soft voting to finalize.
""")

st.image("voting.jpg", width=900)

st.markdown("**Steps for Regression:**")
st.markdown("""
1. Train various regression models.
2. Get predictions from all models.
3. Calculate the average or median of predictions.
""")

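The regression steps above can be sketched with scikit-learn's `VotingRegressor`, which averages the predictions of its base models. The dataset and the choice of base regressors here are illustrative assumptions, not part of the page's linked notebook:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

# Toy regression data (illustrative only)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=42)

# Step 1-2: train several different regression models on the same data
ensemble = VotingRegressor(estimators=[
    ("lr", LinearRegression()),
    ("knn", KNeighborsRegressor(n_neighbors=5)),
    ("tree", DecisionTreeRegressor(max_depth=4, random_state=42)),
])
ensemble.fit(X, y)

# Step 3: the ensemble prediction is the mean of the three models' predictions
preds = ensemble.predict(X[:3])
print(preds.shape)  # one averaged prediction per input row
```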
st.markdown("**Important Parameters:**")
st.markdown("- `voting`: Choose between 'hard' or 'soft' voting\n- `weights`: Assign relative importance to models")

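As a sketch of how these parameters are used, scikit-learn's `VotingClassifier` exposes both `voting` and `weights`; the toy data, base models, and weight values below are assumptions for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

estimators = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("nb", GaussianNB()),
    ("tree", DecisionTreeClassifier(max_depth=3, random_state=0)),
]

# Hard voting: the most frequent class label among the base models wins
hard = VotingClassifier(estimators=estimators, voting="hard")

# Soft voting: average the predicted class probabilities,
# here weighting the logistic regression twice as heavily
soft = VotingClassifier(estimators=estimators, voting="soft", weights=[2, 1, 1])

for clf in (hard, soft):
    clf.fit(X, y)
    print(clf.voting, clf.score(X, y))
```

Note that soft voting requires every base model to support `predict_proba`, which all three models above do.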
# Voting implementation link
st.markdown("<h2>Voting Implementation Example</h2>", unsafe_allow_html=True)
st.markdown(
    "<a href='https://colab.research.google.com/drive/1LPZR9RnvEXP8mzOLOBfSVVyHHZ7GFns4?usp=sharing' target='_blank' style='font-size: 16px; color: #1a237e;'>Open Jupyter Notebook</a>",
    unsafe_allow_html=True
)

# Bagging
st.markdown("<h2>2. Bagging (Bootstrap Aggregating)</h2>", unsafe_allow_html=True)
st.write("Bagging improves stability and reduces variance by training the same algorithm on different random subsets of the dataset, drawn with replacement.")

st.write("Unlike voting, bagging keeps the algorithm fixed and varies the training data to create diverse models.")

st.write("**Variants:**")
st.write("- **Bagging**: General form; any model can be used.")
st.write("- **Random Forest**: Special form using decision trees with added randomness.")

st.image("bagging.jpg", width=900)

st.markdown("**Steps for Classification:**")
st.markdown("""
1. Generate bootstrapped samples.
2. Train models on each sample.
3. Aggregate outputs using majority vote.
""")

st.markdown("**Steps for Regression:**")
st.markdown("""
1. Create random samples from the dataset.
2. Train models on each.
3. Average the predictions.
""")

st.markdown("<h2>How to Create Bootstrapped Samples</h2>", unsafe_allow_html=True)
st.write("**Row and Column Sampling** help increase model diversity in bagging.")

st.write("**Row Sampling:**")
st.write("- With Replacement: Duplicates allowed (classic bootstrapping)")
st.write("- Without Replacement: Unique rows only (pasting)")

st.write("**Column Sampling:**")
st.write("- With Replacement: Some features may repeat.")
st.write("- Without Replacement: Each feature is used only once per model.")

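A minimal sketch of row sampling with and without replacement, using NumPy index sampling (the array sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows = 10

# With replacement (classic bootstrapping): the same row may be drawn twice
boot_idx = rng.choice(n_rows, size=n_rows, replace=True)

# Without replacement (pasting): every drawn row index is unique
paste_idx = rng.choice(n_rows, size=6, replace=False)

print(sorted(boot_idx))   # duplicates are possible here
print(sorted(paste_idx))  # all indices distinct
```

Column sampling works the same way, except the indices select features (columns) instead of rows.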
st.markdown("**Important Parameters:**")
st.markdown("- `n_estimators`: Number of base models to train\n- `max_samples`: Fraction (or count) of the data drawn for each model\n- `bootstrap`: Whether row sampling is done with replacement")

# Bagging implementation link
st.markdown("<h2>Bagging Implementation Example</h2>", unsafe_allow_html=True)
st.markdown(
    "<a href='https://colab.research.google.com/drive/1cumZl7H9fqyORfaw236WWxQViJxvSKHV?usp=sharing' target='_blank' style='font-size: 16px; color: #1a237e;'>Open Jupyter Notebook</a>",
    unsafe_allow_html=True
)

# Random Forest
st.markdown("<h2>3. Random Forest</h2>", unsafe_allow_html=True)
st.write("Random Forest is a popular ensemble method that builds multiple decision trees using bootstrapped samples. It adds another layer of randomness by selecting a random subset of features at each split.")

st.image("randomforest.jpg", width=900)

st.markdown("**Steps for Classification:**")
st.markdown("""
1. Create bootstrapped samples.
2. Train decision trees using random feature selection at each split.
3. Combine predictions using majority vote.
""")

st.markdown("**Steps for Regression:**")
st.markdown("""
1. Prepare bootstrapped training sets.
2. Train decision tree regressors with random feature splits.
3. Predict by averaging model outputs.
""")

st.markdown("**Bagging vs Random Forest:**")
st.markdown("""
- **Bagging:** Works with any base algorithm; row/column sampling is optional
- **Random Forest:** Uses decision trees only, and always samples both rows and features
- **Bagging:** No randomness inside the base learner; diversity comes only from data sampling
- **Random Forest:** Adds extra randomness via feature selection at each split
""")

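The comparison above can be sketched with scikit-learn's `RandomForestClassifier`, where `max_features` controls the per-split feature subsample (toy data; the settings shown are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,
    max_features="sqrt",  # random subset of features considered at each split
    bootstrap=True,       # each tree is trained on a bootstrapped row sample
    random_state=0,
)
forest.fit(X, y)

# Final class labels come from a majority vote across the 100 trees
print(forest.predict(X[:5]))
```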
# Random Forest implementation link
st.markdown("<h2>Random Forest Implementation Example</h2>", unsafe_allow_html=True)
st.markdown(
    "<a href='https://colab.research.google.com/drive/1S6YyfTx9N35E5fpPF0z6ZDm85BSp1deT?usp=sharing' target='_blank' style='font-size: 16px; color: #1a237e;'>Open Jupyter Notebook</a>",
    unsafe_allow_html=True
)

# Conclusion
st.markdown("""
Ensemble learning is a powerful approach that enhances model accuracy, reduces overfitting, and improves robustness. Choosing between techniques like **Voting**, **Bagging**, and **Random Forest** depends on your use case and the nature of the data.
""", unsafe_allow_html=True)