Create Support vector machine.py

pages/Support vector machine.py    ADDED    +114 -0

@@ -0,0 +1,114 @@
import streamlit as st

st.set_page_config(page_title="SVM Theory App", layout="centered")
st.title("📘 Support Vector Machine (SVM) - Theoretical Overview")

# Section: What is SVM
st.header("🧠 What is SVM?")
st.write("""
Support Vector Machine (SVM) is a supervised learning algorithm used for classification and regression tasks.
It tries to find the optimal boundary (called a hyperplane) that separates different classes of data with the **maximum margin**.
""")

# Section: Linearly Separable Case
st.header("📈 Linearly Separable Case")
st.write("""
If the data can be separated by a straight line (in 2D) or a hyperplane (in higher dimensions), SVM finds the one that maximizes the margin.

**Equation of the hyperplane:**

$$
\\mathbf{w}^T \\mathbf{x} + b = 0
$$

Where:
- $\\mathbf{w}$: weight vector
- $b$: bias
- $\\mathbf{x}$: input feature vector

For correct classification:

$$
y_i (\\mathbf{w}^T \\mathbf{x}_i + b) \\geq 1
$$
""")

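# The "maximum margin" above has a simple closed form: the two boundary
# hyperplanes w^T x + b = +1 and w^T x + b = -1 lie 2 / ||w|| apart, so
# maximizing the margin is the same as minimizing ||w||.
st.latex(r"\text{margin} = \frac{2}{\|\mathbf{w}\|}")
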
# Section: Soft Margin
st.header("🛑 Soft Margin SVM")
st.write("""
When perfect separation isn't possible, SVM uses a soft margin: it allows some misclassifications, and the **regularization parameter C** controls how heavily they are penalized.

- **Small C** → wider margin, more tolerance → better generalization.
- **Large C** → narrower margin, less tolerance → risk of overfitting.
""")

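# A minimal sketch of how C behaves in practice, assuming scikit-learn is
# listed in this Space's requirements.txt. A smaller C tolerates more margin
# violations, so more points end up as support vectors.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X_demo, y_demo = make_classification(
    n_samples=200, n_features=2, n_redundant=0, random_state=0
)
for c_value in (0.1, 100.0):
    svc_demo = SVC(kernel="linear", C=c_value).fit(X_demo, y_demo)
    st.write(f"C = {c_value}: {svc_demo.support_vectors_.shape[0]} support vectors")
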
# Section: Kernel Trick
st.header("🔁 Kernel Trick")
st.write("""
SVM can handle **non-linearly separable data** by implicitly projecting it into a higher-dimensional space using kernels.

A **kernel function** computes the similarity of two points in that space without explicitly transforming the data.

Common kernels:
- **Linear**: $K(x, y) = x^T y$
- **Polynomial**: $K(x, y) = (x^T y + c)^d$
- **RBF (Gaussian)**: $K(x, y) = e^{-\\gamma \\|x - y\\|^2}$
- **Sigmoid**: $K(x, y) = \\tanh(\\alpha x^T y + c)$
""")

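# A small worked example of the RBF kernel defined above. numpy ships with
# Streamlit's own dependencies, so this should not need an extra requirement.
import numpy as np

x_a = np.array([1.0, 2.0])
x_b = np.array([2.0, 0.0])
gamma = 0.5
# ||x_a - x_b||^2 = 1 + 4 = 5, so K = exp(-0.5 * 5) ≈ 0.082
rbf_value = np.exp(-gamma * np.sum((x_a - x_b) ** 2))
st.write(f"RBF kernel of two example points (gamma = {gamma}): {rbf_value:.3f}")
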
# Section: Optimization
with st.expander("📐 Dual Optimization Problem (For Math Curious Folks)"):
    st.latex(r"""
    \max_{\alpha} \sum_i \alpha_i - \frac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j K(x_i, x_j)
    """)
    st.markdown("""
Subject to:
- $\\sum_i \\alpha_i y_i = 0$
- $0 \\leq \\alpha_i \\leq C$

Here, $\\alpha_i$ are Lagrange multipliers. The optimization problem is solved with quadratic programming.
""")
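    # A standard follow-up step that links the dual back to prediction: only
    # points with alpha_i > 0 (the support vectors) contribute to the decision
    # function.
    st.latex(r"f(x) = \operatorname{sign}\left( \sum_i \alpha_i y_i K(x_i, x) + b \right)")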

# Section: Support Vectors
st.header("🧷 Support Vectors")
st.write("""
Support vectors are the data points that lie closest to the decision boundary. These points define the hyperplane.

Only the support vectors influence the final decision boundary. Other points can be removed without changing it.
""")

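# A quick numerical check of the claim above, again assuming scikit-learn is
# available. dual_coef_ holds alpha_i * y_i for each support vector, so the
# decision function can be rebuilt from the support vectors alone.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X_sv, y_sv = make_blobs(n_samples=100, centers=2, random_state=1)
clf_sv = SVC(kernel="linear", C=1.0).fit(X_sv, y_sv)
rebuilt = clf_sv.dual_coef_ @ clf_sv.support_vectors_ @ X_sv.T + clf_sv.intercept_
gap = float(np.max(np.abs(rebuilt - clf_sv.decision_function(X_sv))))
st.write(f"Decision function rebuilt from support vectors only, max difference: {gap:.2e}")
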
# Section: SVM vs Logistic Regression
st.header("🆚 SVM vs Logistic Regression")
st.table({
    "Aspect": ["Objective", "Handles Non-Linearity", "Probabilities", "Works with Kernels"],
    "SVM": ["Maximize margin", "✅ Yes", "❌ No (but can be calibrated)", "✅ Yes"],
    "Logistic Regression": ["Maximize likelihood", "❌ No", "✅ Yes", "❌ No"]
})

# Section: Pros and Cons
st.header("✅ Pros and ❌ Cons")
st.markdown("""
**Pros:**
- Effective in high-dimensional spaces
- Handles non-linear problems with the right kernel
- Robust to overfitting when C and the kernel parameters are chosen carefully

**Cons:**
- Training is slow on large datasets
- Sensitive to the choice of kernel and hyperparameters
- Less interpretable than simpler linear models
""")

# Section: Applications
st.header("💡 Real-World Applications")
st.write("""
- Email spam detection
- Handwriting recognition (e.g. digits)
- Image and face classification
- Cancer diagnosis
- Intrusion detection
""")

st.markdown("---")
st.success("Want to see this in action? A small SVM classification demo is sketched below!")

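# Section: Mini Demo
# A minimal interactive sketch, assuming scikit-learn is listed in this
# Space's requirements.txt. It fits an SVC on the classic Iris dataset with a
# user-selected kernel and reports held-out accuracy; test_size, random_state
# and the default C/gamma are illustrative choices, not tuned values.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

st.header("🧪 Mini SVM Demo (Iris)")
kernel_choice = st.selectbox("Kernel", ["linear", "poly", "rbf", "sigmoid"])

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
demo_clf = SVC(kernel=kernel_choice).fit(X_train, y_train)
st.write(f"Test accuracy with the {kernel_choice} kernel: {demo_clf.score(X_test, y_test):.2f}")
st.write(f"Support vectors per class: {demo_clf.n_support_.tolist()}")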