---
license: mit
datasets:
- custom
metrics:
- mean_squared_error
- mean_absolute_error
- r2_score
model_name: Fertilizer Recommendation System
tags:
- random-forest
- regression
- multioutput
- classification
- agriculture
- soil-nutrients
---

# Fertilizer Application Recommendation System

## Overview

This model predicts the fertilizer requirements for various crops based on input features such as crop type, target yield, field size, and soil properties. It combines a Random Forest regressor and a Random Forest classifier to predict both numerical values (e.g., nutrient needs) and categorical values (e.g., fertilizer application instructions).

## Training Data

The model was trained on a custom dataset containing the following features:

- Crop Name
- Target Yield
- Field Size
- pH (water)
- Organic Carbon
- Total Nitrogen
- Phosphorus (M3)
- Potassium (exch.)
- Soil moisture

The target variables include:

**Numerical Targets**:

- Nitrogen (N) Need
- Phosphorus (P2O5) Need
- Potassium (K2O) Need
- Organic Matter Need
- Lime Need
- Lime Application - Requirement
- Organic Matter Application - Requirement
- 1st Application - Requirement (1)
- 1st Application - Requirement (2)
- 2nd Application - Requirement (1)

**Categorical Targets**:

- Lime Application - Instruction
- Lime Application
- Organic Matter Application - Instruction
- Organic Matter Application
- 1st Application
- 1st Application - Type fertilizer (1)
- 1st Application - Type fertilizer (2)
- 2nd Application
- 2nd Application - Type fertilizer (1)

## Model Training

The model was trained using the following steps:

1. **Data Preprocessing**:
   - Handling missing values
   - Scaling numerical features using `StandardScaler`
   - One-hot encoding categorical features
2. **Modeling**:
   - Splitting the dataset into training and testing sets
   - Training a `RandomForestRegressor` for the numerical targets, wrapped in a `MultiOutputRegressor`
   - Training a `RandomForestClassifier` for the categorical targets, wrapped in a `MultiOutputClassifier`
3. **Evaluation**:
   - Evaluating the models on the test set with Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared (R2) score for regression, and accuracy for classification

## Evaluation Metrics

The model was evaluated using the following metrics:

- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)
- R-squared (R2) Score
- Accuracy for categorical targets

## How to Use

### Input Format

The model expects input data in JSON format with the following fields:

- "Crop Name": String
- "Target Yield": Numeric
- "Field Size": Numeric
- "pH (water)": Numeric
- "Organic Carbon": Numeric
- "Total Nitrogen": Numeric
- "Phosphorus (M3)": Numeric
- "Potassium (exch.)": Numeric
- "Soil moisture": Numeric

### Preprocessing Steps

The inference script performs the following steps:

- Loads the models and the preprocessor.
- Defines the categorical and numerical targets.
- Loads the label encoders.
- Defines a function `make_predictions` that preprocesses the input data, makes predictions, and decodes the categorical predictions.
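
Before a record is passed to the preprocessor, it can help to check that it carries every field listed under Input Format with a plausible type. The following is a minimal sketch of such a check in plain Python. `REQUIRED_FIELDS` and `validate_input` are hypothetical names introduced here for illustration; they are not part of the published script.

```python
from numbers import Real

# Field names and types copied from the Input Format list above.
REQUIRED_FIELDS = {
    "Crop Name": str,
    "Target Yield": Real,
    "Field Size": Real,
    "pH (water)": Real,
    "Organic Carbon": Real,
    "Total Nitrogen": Real,
    "Phosphorus (M3)": Real,
    "Potassium (exch.)": Real,
    "Soil moisture": Real,
}


def validate_input(record):
    """Return a list of problems found in `record` (empty if none)."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
            continue
        value = record[field]
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type for {field}: {type(value).__name__}")
    return problems


sample = {
    "Crop Name": "maize(corn)",
    "Target Yield": 3600.0,
    "Field Size": 1.0,
    "pH (water)": 6.1,
    "Organic Carbon": 11.4,
    "Total Nitrogen": 1.1,
    "Phosphorus (M3)": 1.8,
    "Potassium (exch.)": 3.0,
    "Soil moisture": 20.0,
}
print(validate_input(sample))  # prints []
```

This check only covers presence and type. Whether a crop name was seen during training is decided by the fitted preprocessor, which this sketch does not inspect.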
### Inference Procedure

```python
import pandas as pd
from joblib import load
from huggingface_hub import hf_hub_download
from sklearn.preprocessing import LabelEncoder  # scikit-learn must be installed to unpickle the artifacts

REPO_ID = 'Briankabiru/FertiliserApplication'

# Load models and preprocessor
preprocessor_path = hf_hub_download(repo_id=REPO_ID, filename='preprocessor.joblib')
numerical_model_path = hf_hub_download(repo_id=REPO_ID, filename='numerical_model.joblib')
categorical_model_path = hf_hub_download(repo_id=REPO_ID, filename='categorical_model.joblib')

preprocessor = load(preprocessor_path)
numerical_model = load(numerical_model_path)
categorical_model = load(categorical_model_path)

# Define categorical targets
categorical_targets = [
    'Lime Application - Instruction',
    'Lime Application',
    'Organic Matter Application - Instruction',
    'Organic Matter Application',
    '1st Application',
    '1st Application - Type fertilizer (1)',
    '1st Application - Type fertilizer (2)',
    '2nd Application',
    '2nd Application - Type fertilizer (1)',
    '1st Application_1',
    '1st Application - Type fertilizer (1)_3',
    '1st Application - Type fertilizer (2)_5',
    '2nd Application_6',
    '1st Application_21',
    '1st Application - Type fertilizer (1)_23',
    '1st Application - Type fertilizer (2)_25',
    '2nd Application_26',
    '2nd Application - Type fertilizer (1)_28'
]

# Define numerical targets
numerical_targets = [
    'Nitrogen (N) Need',
    'Phosphorus (P2O5) Need',
    'Potassium (K2O) Need',
    'Organic Matter Need',
    'Lime Need',
    'Lime Application - Requirement',
    'Organic Matter Application - Requirement',
    '1st Application - Requirement (1)',
    '1st Application - Requirement (2)',
    '2nd Application - Requirement (1)'
]

# Load label encoders (one per categorical target)
label_encoders = {
    col: load(hf_hub_download(repo_id=REPO_ID, filename=f'label_encoder_{col}.joblib'))
    for col in categorical_targets
}


def make_predictions(input_data):
    # Convert input data to DataFrame
    input_df = pd.DataFrame([input_data])

    # Preprocess the input data
    X_transformed = preprocessor.transform(input_df)

    # Predict with numerical model
    numerical_predictions = numerical_model.predict(X_transformed)

    # Predict with categorical model
    categorical_predictions_encoded = categorical_model.predict(X_transformed)

    # Decode categorical predictions; fall back to "Unknown" if a label cannot be decoded
    categorical_predictions_decoded = {}
    for i, col in enumerate(categorical_targets):
        le = label_encoders[col]
        try:
            categorical_predictions_decoded[col] = le.inverse_transform(categorical_predictions_encoded[:, i])
        except ValueError:
            categorical_predictions_decoded[col] = ["Unknown"] * len(categorical_predictions_encoded[:, i])

    # Combine numerical and categorical predictions into a dictionary
    predictions_combined = {col: numerical_predictions[0, i] for i, col in enumerate(numerical_targets)}
    predictions_combined.update({col: categorical_predictions_decoded[col][0] for col in categorical_targets})

    return predictions_combined


# Example usage
input_data = {
    'Crop Name': 'maize(corn)',
    'Target Yield': 3600.0,
    'Field Size': 1.0,
    'pH (water)': 6.1,
    'Organic Carbon': 11.4,
    'Total Nitrogen': 1.1,
    'Phosphorus (M3)': 1.8,
    'Potassium (exch.)': 3.0,
    'Soil moisture': 20.0
}

predictions = make_predictions(input_data)

print("Predicted Fertilizer Requirements:")
for col, pred_value in predictions.items():
    print(f"{col}: {pred_value}")
```
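
`make_predictions` returns one flat dictionary that mixes numeric requirements with decoded labels. For display or further calculation it may be convenient to separate the two groups. The helper below is a minimal sketch of that step. `split_predictions` is a hypothetical name, the two-decimal rounding is an arbitrary choice, and the sample values are placeholders rather than real model output.

```python
def split_predictions(predictions, numerical_targets, ndigits=2):
    """Split a combined prediction dict into (numeric, categorical) dicts."""
    numeric_keys = set(numerical_targets)
    numeric = {}
    categorical = {}
    for name, value in predictions.items():
        if name in numeric_keys:
            # float() also turns NumPy scalars into plain Python floats.
            numeric[name] = round(float(value), ndigits)
        else:
            categorical[name] = str(value)
    return numeric, categorical


# Placeholder values for illustration only, not real model output.
example = {
    "Nitrogen (N) Need": 61.23456,
    "Lime Need": 0.0,
    "Lime Application - Instruction": "example instruction",
}
numeric, categorical = split_predictions(
    example, ["Nitrogen (N) Need", "Lime Need"]
)
print(numeric)
print(categorical)
```

With the inference script loaded, the same call would be `split_predictions(predictions, numerical_targets)`.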