import streamlit as st
st.set_page_config(page_title="SVM Theory App", layout="centered")
st.title("📘 Support Vector Machine (SVM) - Theoretical Overview")
# Section: What is SVM
st.header("🧠 What is SVM?")
st.write("""
Support Vector Machine (SVM) is a supervised learning algorithm used for classification and regression tasks.
It tries to find the optimal boundary (called a hyperplane) that separates different classes of data with the **maximum margin**.
""")
# Section: Linearly Separable Case
st.header("📈 Linearly Separable Case")
st.write("""
If the data can be separated by a straight line (in 2D) or a hyperplane (in higher dimensions), SVM finds the one that maximizes the **margin**, i.e. the distance from the hyperplane to the closest training points of each class.
**Equation of the hyperplane:**
$$
\\mathbf{w}^T \\mathbf{x} + b = 0
$$
Where:
- $\\mathbf{w}$: weight vector (normal to the hyperplane)
- $b$: bias term
- $\\mathbf{x}$: input feature vector
For every training point $\\mathbf{x}_i$ with label $y_i \\in \\{-1, +1\\}$, correct classification with margin requires:
$$
y_i (\\mathbf{w}^T \\mathbf{x}_i + b) \\geq 1
$$
""")
# Section: Soft Margin
st.header("🛑 Soft Margin SVM")
st.write("""
When perfect separation isn't possible, SVM introduces a soft margin: some misclassifications are allowed, and their cost is controlled by the **regularization parameter C**.
- **Small C** → wider margin, more misclassifications tolerated → smoother boundary (risk of underfitting).
- **Large C** → narrower margin, misclassifications penalized heavily → fits the training data closely (risk of overfitting).
""")
# Section: Kernel Trick
st.header("🔁 Kernel Trick")
st.write("""
SVM can handle **non-linearly separable data** by implicitly mapping it into a higher-dimensional space using kernels.
A **kernel function** computes the inner product (a similarity score) in that space without explicitly transforming the data.
Common kernels:
- **Linear**: $K(x, y) = x^T y$
- **Polynomial**: $K(x, y) = (x^T y + c)^d$
- **RBF (Gaussian)**: $K(x, y) = e^{-\\gamma \\|x - y\\|^2}$
- **Sigmoid**: $K(x, y) = \\tanh(\\alpha x^T y + c)$
""")
# Section: Optimization
with st.expander("📐 Dual Optimization Problem (For Math Curious Folks)"):
    st.latex(r"""
\max_{\alpha} \sum_i \alpha_i - \frac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j K(x_i, x_j)
    """)
    st.markdown("""
Subject to:
- $\\sum_i \\alpha_i y_i = 0$
- $0 \\leq \\alpha_i \\leq C$
Here, $\\alpha_i$ are the Lagrange multipliers, and the points with $\\alpha_i > 0$ are exactly the support vectors. The optimization is solved with quadratic programming.
    """)
# Section: Support Vectors
st.header("🧷 Support Vectors")
st.write("""
Support vectors are the training points that lie closest to the decision boundary (on or within the margin). They alone determine the hyperplane: any other point can be removed without changing the final decision boundary.
""")
# Section: SVM vs Logistic Regression
st.header("🆚 SVM vs Logistic Regression")
st.table({
"Aspect": ["Objective", "Handles Non-Linearity", "Probabilities", "Works with Kernels"],
"SVM": ["Maximize margin", "✅ Yes", "❌ No (but can be calibrated)", "✅ Yes"],
"Logistic Regression": ["Maximize likelihood", "❌ No", "✅ Yes", "❌ No"]
})
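# Hedged sketch (displayed, not executed) of the "can be calibrated" note in the table above:
# SVC can emit probabilities via Platt scaling when probability=True. X_train, y_train,
# X_test are placeholders; assumes scikit-learn is installed where this runs.
st.caption("Illustrative sketch (assumes scikit-learn): getting probabilities out of an SVM.")
st.code(
    """
from sklearn.svm import SVC

clf = SVC(kernel="rbf", probability=True).fit(X_train, y_train)
proba = clf.predict_proba(X_test)  # calibrated class probabilities (Platt scaling)
""",
    language="python",
)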
# Section: Pros and Cons
st.header("✅ Pros and ❌ Cons")
st.markdown("""
**Pros:**
- Effective in high-dimensional feature spaces
- Handles non-linear problems well with an appropriate kernel
- Fairly robust to overfitting when C and the kernel are tuned carefully
**Cons:**
- Training scales poorly to large datasets
- Sensitive to the choice of kernel and hyperparameters
- Less interpretable than simpler models such as logistic regression
""")
# Section: Applications
st.header("💡 Real-World Applications")
st.write("""
- Email spam detection
- Handwriting recognition (e.g. digits)
- Image and face classification
- Cancer diagnosis
- Intrusion detection
""")
st.markdown("---")
st.success("Want to see this in action? Try the small classification demo sketched below!")
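# A minimal interactive demo sketch. scikit-learn is NOT a guaranteed dependency of this
# Space, so the demo is skipped gracefully if the import fails; the toy dataset and
# hyperparameter range below are arbitrary illustrative choices.
try:
    from sklearn.datasets import make_moons
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    st.header("🧪 Mini Demo: RBF-SVM on Toy Data")
    C_value = st.slider("Regularization parameter C", 0.01, 10.0, 1.0)
    X, y = make_moons(n_samples=300, noise=0.25, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
    clf = SVC(kernel="rbf", C=C_value).fit(X_train, y_train)
    st.write(f"Test accuracy: **{clf.score(X_test, y_test):.2%}**")
    st.write(f"Support vectors used: {int(clf.n_support_.sum())} of {len(X_train)} training points")
except ImportError:
    st.info("Install scikit-learn to enable the demo sketched at the bottom of this file.")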