import streamlit as st
st.set_page_config(page_title="Bagging & Random Forest")
st.title("🌟 Ensemble Learning: Bagging & Random Forest")
# Topic selector
technique = st.radio("🔍 Choose a Technique", [
"👜 Bagging (Bootstrap Aggregation)",
"🌳 Random Forest"
], horizontal=True)
if technique == "👜 Bagging (Bootstrap Aggregation)":
    st.header("👜 Bagging (Bootstrap Aggregation)")
    st.markdown("""
**Bagging** stands for *Bootstrap Aggregating*. It is an ensemble method that improves model stability and accuracy by training multiple models on different subsets of the data.
### 🔍 How it Works:
- Multiple models are trained on different **bootstrapped samples** (random samples with replacement).
- Each model gives a prediction, and the results are **aggregated** (e.g., majority vote for classification, average for regression).
### ✅ Use Cases:
- High-variance models (e.g., Decision Trees)
- When overfitting is a concern
### 🔧 Key Parameters:
- `n_estimators`: Number of base models
- `max_samples`: Number (or fraction) of training samples drawn for each base model
- `bootstrap`: Whether to sample with replacement
### ✅ Pros:
- Reduces variance
- Prevents overfitting
- Easy to parallelize
### ⚠️ Cons:
- Can be computationally expensive
- Doesn't reduce bias
""")
elif technique == "🌳 Random Forest":
    st.header("🌳 Random Forest")
    st.markdown("""
**Random Forest** is a popular ensemble method built on top of Bagging, with an extra layer of randomness.
### 🔍 How it Works:
- Uses **Bagging** with Decision Trees.
- Adds randomness by selecting a random subset of features at each split (not just data samples).
### ✅ Use Cases:
- Classification and Regression
- Feature selection and importance scoring
### 🔧 Key Parameters:
- `n_estimators`: Number of trees in the forest
- `max_depth`: Maximum tree depth
- `max_features`: Number of features to consider at each split
- `bootstrap`: Whether to use bootstrapped samples
### ✅ Pros:
- Reduces variance and overfitting
- Handles high-dimensional data well
- Provides feature importance
### ⚠️ Cons:
- Training and prediction slow down as the forest grows
- Less interpretable than single decision trees
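### 🧪 Example:
A minimal sketch with scikit-learn's `RandomForestClassifier`; the synthetic dataset and parameter values are illustrative, not from this page:
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Illustrative synthetic classification dataset
X, y = make_classification(n_samples=500, n_features=10, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,     # number of trees in the forest
    max_depth=None,       # grow each tree fully
    max_features="sqrt",  # random feature subset considered at each split
    bootstrap=True,       # each tree sees a bootstrap sample
    random_state=0,
)
forest.fit(X_train, y_train)
print(f"Test accuracy: {forest.score(X_test, y_test):.3f}")
print("Most important feature index:", int(forest.feature_importances_.argmax()))
```
`max_features="sqrt"` is the per-split feature subsampling that distinguishes Random Forest from plain Bagging of trees, and `feature_importances_` provides the importance scoring mentioned above.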
""")
# Common footer note
st.markdown("""
---
📌 **Tip**: Bagging works best with models that have high variance, and Random Forest is a powerful extension of this idea that works exceptionally well on structured/tabular data.
""")