# genetic_algorithm/pages/4_Decision_Tree_Classification.py
import streamlit as st
def main():
    st.set_page_config(page_title="Decision Tree Classification", layout="wide")
    st.title("Decision Tree Classification")

    # Introduction to Decision Tree
    st.header("The Decision Tree Algorithm")
    st.markdown("""
A decision tree is a flowchart-like tree structure in which each internal node represents a feature (or attribute), each branch represents a decision rule, and each leaf node represents an outcome.
The topmost node in a decision tree is known as the root node. The tree learns to partition the data on the basis of attribute values, splitting it recursively in a process called recursive partitioning. This flowchart-like structure supports decision-making, and because its visualization resembles a flowchart, it closely mimics human reasoning. That is why decision trees are easy to understand and interpret.
""")
    st.image("The Decision Tree Algorithm.png")
    st.markdown("""
A decision tree is a white-box type of ML algorithm: it exposes its internal decision-making logic, which is not available in black-box algorithms such as neural networks. Its training time is also faster than that of a neural network.
The time complexity of decision trees is a function of the number of records and attributes in the given data. The decision tree is a distribution-free, or non-parametric, method that does not depend on probability distribution assumptions, and it can handle high-dimensional data with good accuracy.
""")

    # How Does the Decision Tree Algorithm Work?
    st.header("How Does the Decision Tree Algorithm Work?")
    st.markdown("""
The basic idea behind any decision tree algorithm is as follows:
- Select the best attribute using an Attribute Selection Measure (ASM) to split the records.
- Make that attribute a decision node and break the dataset into smaller subsets.
- Start tree building by repeating this process recursively for each child until one of the following conditions is met:
  - All the tuples belong to the same class.
  - There are no remaining attributes.
  - There are no more instances.

A minimal code sketch of this recursive partitioning is shown below.
""")
    st.image("Decision Tree Algorithm.webp")

    # Attribute Selection Measures
    st.header("Attribute Selection Measures")
    st.markdown("""
An attribute selection measure is a heuristic for selecting the splitting criterion that partitions the data in the best possible manner. It is also known as a splitting rule because it helps us determine breakpoints for tuples at a given node. An ASM assigns a rank to each feature (or attribute) according to how well it explains the given dataset, and the attribute with the best score is selected as the splitting attribute. In the case of a continuous-valued attribute, split points for the branches also need to be defined. The most popular selection measures are Information Gain, Gain Ratio, and Gini Index.

**Gini index**

Another decision tree algorithm, CART (Classification and Regression Trees), uses the Gini method to create split points.
""")
    st.image("Gini method.webp")
st.markdown("""
The Gini Index considers a binary split for each attribute. You can compute a weighted sum of the impurity of each partition. If a binary split on attribute A partitions data D into D1 and D2, the Gini index of D is calculated, and the attribute with the minimum Gini index is chosen as the splitting attribute.
""")
st.image("The Gini Index.webp")
st.image("smaller gini index.webp")

    # YouTube Video for additional content
    st.header("Learn More Through This Video")
    st.video("https://www.youtube.com/watch?v=_L39rN6gz7Y")

    # Add a footer
    st.markdown("---")
    st.write("Made with ❤️ by Viga, Hanum, & Robit")

if __name__ == "__main__":
    main()