Iris Classification Models
This repository starts with a Decision Tree model trained on the classic Iris dataset. The model classifies iris flowers into three species—setosa, versicolor, or virginica—based on four numeric features (sepal length, sepal width, petal length, and petal width).
Because of its small size and simplicity, this model is intended primarily for demonstration and educational purposes.
Model Description
- Framework: Scikit-Learn
- Algorithm: Decision Tree (DecisionTreeClassifier class)
- Hyperparameters:
  - Defaults for Decision Tree in Scikit-Learn
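As a minimal sketch of what "defaults" means here, the snippet below creates an untrained DecisionTreeClassifier with no arguments and prints a few of its default hyperparameters. The card does not record which scikit-learn version was used for training, so treat the printed values as those of your installed version rather than a guaranteed match for this model.

```python
from sklearn.tree import DecisionTreeClassifier

# A classifier created with no arguments uses scikit-learn's defaults.
default_tree = DecisionTreeClassifier()
default_params = default_tree.get_params()

# Print a few of the settings that most affect how the tree grows.
for name in ("criterion", "max_depth", "min_samples_split", "min_samples_leaf"):
    print(f"{name} = {default_params[name]!r}")
```

With max_depth left at None, the tree keeps splitting until its leaves are pure or too small to split, which is one reason results on a dataset as small as Iris can be sensitive to the particular split.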
Intended Uses
- Education/Proof-of-Concept: Demonstrates loading a scikit-learn model from the Hugging Face Hub.
- Beginner ML Tutorials: Introduction to classification tasks, usage of Hugging Face model hosting, and deploying simple demos in Spaces.
Limitations
- Dataset Size: The Iris dataset is small (150 samples). Performance metrics may not extrapolate to real-world scenarios.
- Domain Constraints: The dataset only covers three iris species and may not generalize to other types of flowers.
- Not Production-Ready: This model is not suited for critical applications (e.g., healthcare, autonomous vehicles).
How to Use
To use this model, load the .joblib file from the Hub in Python:
import joblib
from huggingface_hub import hf_hub_download

# Download the serialized model from the Hub.
# The accompanying dataset is hosted on Hugging Face under 'brjapon/iris'.
model_path = hf_hub_download(repo_id="brjapon/iris-dt",
                             filename="iris_dt.joblib",
                             repo_type="model")
model = joblib.load(model_path)

# Example prediction (illustrative values, in the order:
# sepal length, sepal width, petal length, petal width)
sample_input = [[5.1, 3.5, 1.4, 0.2]]
prediction = model.predict(sample_input)
print(prediction)  # e.g., [0] which might correspond to 'setosa'
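The card does not state how the predicted labels are encoded. The sketch below assumes integer labels in the order used by scikit-learn's bundled Iris dataset (0 = setosa, 1 = versicolor, 2 = virginica). `SPECIES` and `to_species` are hypothetical helpers introduced for illustration, so check the label column of the training dataset before relying on this mapping.

```python
# Assumed label order (matches sklearn.datasets.load_iris); verify it against
# the dataset the model was actually trained on.
SPECIES = ("setosa", "versicolor", "virginica")


def to_species(predictions):
    """Map predicted labels to species names.

    String labels are passed through unchanged; integer labels are
    looked up in SPECIES.
    """
    names = []
    for label in predictions:
        if isinstance(label, str):
            names.append(label)
        else:
            names.append(SPECIES[int(label)])
    return names


print(to_species([0, 2, 1]))  # ['setosa', 'virginica', 'versicolor']
```

With a loaded model, the call would be to_species(model.predict(sample_input)).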
Training Procedure
- Training Data: 80% of the 150-sample Iris dataset (120 samples).
- Test Data: 20% (30 samples), held out for evaluation.
- Steps:
  - Loaded dataset (obtained from HF repository brjapon/iris)
  - Split into training and test sets with train_test_split
  - Trained Decision Tree model with default settings
  - Evaluated accuracy on the test set
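The steps above can be sketched as follows. This is not the author's training script: it assumes scikit-learn's bundled copy of Iris (load_iris) as a stand-in for the brjapon/iris repository, and the random_state values are arbitrary choices made here for reproducibility.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in for the brjapon/iris dataset: scikit-learn's bundled Iris data.
X, y = load_iris(return_X_y=True)

# 80/20 split: 120 training samples and 30 test samples.
# random_state is an assumption, not a value taken from the card.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Decision Tree with default hyperparameters; the seed only makes
# tie-breaking between equally good splits repeatable.
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Accuracy on the held-out test set.
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {test_accuracy:.3f}")
```

A model trained this way could then be serialized with joblib.dump(model, "iris_dt.joblib"), which produces a file with the name used in the How to Use section.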
Performance
Using a random 80/20 split, the model typically achieves ~97% accuracy on the test subset. Results vary with the random state used for the train/test split.
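To see how much the split matters, the sketch below repeats the 80/20 split with several seeds and reports the spread of test accuracy. It assumes scikit-learn's bundled Iris data as a stand-in for brjapon/iris. The ten seeds are arbitrary, and the numbers it prints are not the card's reported result.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

accuracies = []
for seed in range(10):  # ten arbitrary split seeds
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    # Fix the tree's own seed so only the split changes between runs.
    tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
    accuracies.append(tree.score(X_test, y_test))

mean_accuracy = sum(accuracies) / len(accuracies)
print(f"min={min(accuracies):.3f} max={max(accuracies):.3f} mean={mean_accuracy:.3f}")
```

With only 30 test samples, each misclassified flower changes accuracy by about 3.3 percentage points, so ~97% corresponds to a single error (29/30).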
Limitations & Bias
- The Iris dataset is not representative of modern, large-scale classification tasks.
- Results should not be generalized beyond the included species and scenario.
Dataset used to train brjapon/iris-dt: brjapon/iris
Evaluation results
- Test Accuracy (self-reported): 0.970