Model Card for Product Return Prediction
model details
- person or organization developing model: team product-return-prediction
- model date: 24/11/2024
- model version: v1.4
- model type: Support Vector Machine
This model is a Support Vector Machine (SVM) classifier that predicts whether a product will be returned, based on product and transaction features. The hyperparameters (C, kernel type, and gamma) are selected by grid search with 10-fold cross-validation.
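As an illustration of the tuning procedure described above, here is a minimal sketch of a grid search over C, kernel, and gamma with 10-fold cross-validation in scikit-learn. The synthetic data and the candidate values in `param_grid` are assumptions made for the example; the card does not list the actual grid or the selected values.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): the real feature matrix is not shown in this card.
X, y = make_classification(
    n_samples=200, n_features=6, weights=[0.7, 0.3], random_state=42
)

# Hypothetical candidate values; the card names the tuned hyperparameters only.
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
    "gamma": ["scale", 0.1],
}

# cv=10 gives 10-fold cross-validation (stratified by default for classifiers).
search = GridSearchCV(SVC(), param_grid, cv=10)
search.fit(X, y)

print(search.best_params_)
```

After fitting, `search.best_estimator_` holds an SVC refit on all of the supplied data with the best-scoring combination.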
intended use
primary intended uses
The model helps e-commerce owners (here, Armani) identify likely returns among purchases, so that inventories can be reorganized to reduce product handling and transportation costs.
primary intended users
The model was developed for Armani, specifically to support professionals working in logistics, product management, and marketing.
factors
relevant factors
The following factors are relevant to the model:
- product features: characteristics such as model, fabric, colour, composition, and product category may have a significant impact on the likelihood of a product being returned
- imbalanced classes: returns are the minority class, and this imbalance may reduce the model's ability to predict them accurately
decision thresholds
The default decision threshold for the SVM model is 0.5: a predicted return probability greater than or equal to 0.5 yields a "returned" prediction, and a probability below 0.5 yields "not returned."
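The threshold rule can be sketched as a small function. The name `predict_label` and the sample probabilities are hypothetical. The card does not say how probabilities are obtained; in scikit-learn, `SVC` exposes `predict_proba` only when constructed with `probability=True`, so this sketch simply assumes a probability is already available.

```python
def predict_label(prob_return: float, threshold: float = 0.5) -> str:
    """Map a predicted return probability to a class label.

    Probabilities at or above the threshold count as "returned".
    """
    return "returned" if prob_return >= threshold else "not returned"


# Hypothetical probabilities for three purchases.
for p in (0.12, 0.50, 0.87):
    print(p, predict_label(p))
```

Raising the threshold above 0.5 would trade recall on the "returned" class for precision, which may matter given the class imbalance noted earlier.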
Train and Test data
dataset description
- dataset: German Sales 2023 EA
The model was trained and tested on this dataset after the splitting and pre-processing steps described below.
split
Dataset splitting is as follows:
- training: 80%
- validation and test: 20%
The split is performed with the corresponding scikit-learn function, using a random state of 42.
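A minimal sketch of an 80/20 split with a fixed random state, assuming the scikit-learn function in question is `train_test_split` (the card does not name it). The toy `X` and `y` are placeholders, and whether the real split was stratified is not stated, so no stratification is used here.

```python
from sklearn.model_selection import train_test_split

# Placeholder data (assumption): 100 single-feature rows with alternating labels.
X = [[i] for i in range(100)]
y = [i % 2 for i in range(100)]

# test_size=0.2 holds out 20% of the rows; random_state=42 makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(len(X_train), len(X_test))
```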
pre-processing
To suit the binary classification task and a numerical model such as an SVM, the dataset underwent an extensive pre-processing phase. Pre-processing steps are the following:
- Dataset conversion from Excel to TSV
- Specific columns removal from dataframe
- Train and test data splitting
- Train and save scaler
- Scaling data with a pre-trained scaler
- Target encoding of categorical columns
- Preparation of inventory with sales data
- Population of missing values
- Calculation and application of return percentages by color
- Final cleaning and processing
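One of the steps above, the return percentage by color, can be sketched in plain Python. This is an illustration under stated assumptions, not the team's actual code: the function names, the toy rows, and the choice to fall back to the overall return rate for colors absent from the training data are all hypothetical. Rates are computed on training rows only so that test labels do not leak into the feature.

```python
from collections import defaultdict


def return_rate_by_colour(rows):
    """Compute the share of returned items per colour from (colour, returned) pairs."""
    counts = defaultdict(lambda: [0, 0])  # colour -> [returns, total]
    for colour, returned in rows:
        counts[colour][0] += int(returned)
        counts[colour][1] += 1
    return {colour: r / n for colour, (r, n) in counts.items()}


def apply_rates(colours, rates, default):
    """Replace each colour with its return rate, using `default` for unseen colours."""
    return [rates.get(colour, default) for colour in colours]


# Toy training rows (assumption): (colour, 1 if returned else 0).
train_rows = [("black", 1), ("black", 0), ("red", 0), ("red", 0), ("blue", 1)]
rates = return_rate_by_colour(train_rows)

# Overall training return rate, used as the fallback for unseen colours.
global_rate = sum(returned for _, returned in train_rows) / len(train_rows)

encoded = apply_rates(["black", "green"], rates, default=global_rate)
print(rates, encoded)
```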
Quantitative analysis
Class | Precision | Recall | F1-score | Support |
---|---|---|---|---|
No return | 0.95 | 0.95 | 0.95 | 2086 |
Return | 0.89 | 0.90 | 0.89 | 960 |
Accuracy | | | 0.93 | |
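The reported figures can be cross-checked against each other. Overall accuracy equals the support-weighted average of the per-class recalls, and each F1-score is the harmonic mean of precision and recall. The short sketch below redoes that arithmetic from the rounded values in the table, so small rounding differences are expected; the variable and function names are illustrative.

```python
# Values copied from the table above.
support = {"No return": 2086, "Return": 960}
precision = {"No return": 0.95, "Return": 0.89}
recall = {"No return": 0.95, "Return": 0.90}


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


total = sum(support.values())

# Accuracy = correctly classified items / all items
#          = sum over classes of recall * support, divided by total support.
accuracy = sum(recall[c] * support[c] for c in support) / total

print(round(accuracy, 2))
for c in support:
    print(c, round(f1(precision[c], recall[c]), 2))
```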