surahj commited on
Commit
64f3974
·
0 Parent(s):

Initial commit

Browse files
.gitignore ADDED
@@ -0,0 +1,277 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Python
2
+ __pycache__/
3
+ *.py[cod]
4
+ *$py.class
5
+ *.so
6
+ .Python
7
+ build/
8
+ develop-eggs/
9
+ dist/
10
+ downloads/
11
+ eggs/
12
+ .eggs/
13
+ lib/
14
+ lib64/
15
+ parts/
16
+ sdist/
17
+ var/
18
+ wheels/
19
+ share/python-wheels/
20
+ *.egg-info/
21
+ .installed.cfg
22
+ *.egg
23
+ MANIFEST
24
+
25
+ # Virtual environments
26
+ myenv/
27
+ venv/
28
+ env/
29
+ ENV/
30
+ env.bak/
31
+ venv.bak/
32
+ .venv/
33
+ .env/
34
+
35
+ # PyInstaller
36
+ *.manifest
37
+ *.spec
38
+
39
+ # Installer logs
40
+ pip-log.txt
41
+ pip-delete-this-directory.txt
42
+
43
+ # Unit test / coverage reports
44
+ htmlcov/
45
+ .tox/
46
+ .nox/
47
+ .coverage
48
+ .coverage.*
49
+ .cache
50
+ nosetests.xml
51
+ coverage.xml
52
+ *.cover
53
+ *.py,cover
54
+ .hypothesis/
55
+ .pytest_cache/
56
+ cover/
57
+
58
+ # Translations
59
+ *.mo
60
+ *.pot
61
+
62
+ # Django stuff:
63
+ *.log
64
+ local_settings.py
65
+ db.sqlite3
66
+ db.sqlite3-journal
67
+
68
+ # Flask stuff:
69
+ instance/
70
+ .webassets-cache
71
+
72
+ # Scrapy stuff:
73
+ .scrapy
74
+
75
+ # Sphinx documentation
76
+ docs/_build/
77
+
78
+ # PyBuilder
79
+ .pybuilder/
80
+ target/
81
+
82
+ # Jupyter Notebook
83
+ .ipynb_checkpoints
84
+
85
+ # IPython
86
+ profile_default/
87
+ ipython_config.py
88
+
89
+ # pyenv
90
+ .python-version
91
+
92
+ # pipenv
93
+ Pipfile.lock
94
+
95
+ # poetry
96
+ poetry.lock
97
+
98
+ # pdm
99
+ .pdm.toml
100
+
101
+ # PEP 582
102
+ __pypackages__/
103
+
104
+ # Celery stuff
105
+ celerybeat-schedule
106
+ celerybeat.pid
107
+
108
+ # SageMath parsed files
109
+ *.sage.py
110
+
111
+ # Environments
112
+ .env
113
+ .venv
114
+ env/
115
+ venv/
116
+ ENV/
117
+ env.bak/
118
+ venv.bak/
119
+
120
+ # Spyder project settings
121
+ .spyderproject
122
+ .spyproject
123
+
124
+ # Rope project settings
125
+ .ropeproject
126
+
127
+ # mkdocs documentation
128
+ /site
129
+
130
+ # mypy
131
+ .mypy_cache/
132
+ .dmypy.json
133
+ dmypy.json
134
+
135
+ # Pyre type checker
136
+ .pyre/
137
+
138
+ # pytype static type analyzer
139
+ .pytype/
140
+
141
+ # Cython debug symbols
142
+ cython_debug/
143
+
144
+ # PyCharm
145
+ .idea/
146
+ *.iws
147
+ *.iml
148
+ *.ipr
149
+
150
+ # VS Code
151
+ .vscode/
152
+ *.code-workspace
153
+
154
+ # Sublime Text
155
+ *.sublime-project
156
+ *.sublime-workspace
157
+
158
+ # Vim
159
+ *.swp
160
+ *.swo
161
+ *~
162
+
163
+ # Emacs
164
+ *~
165
+ \#*\#
166
+ /.emacs.desktop
167
+ /.emacs.desktop.lock
168
+ *.elc
169
+ auto-save-list
170
+ tramp
171
+ .\#*
172
+
173
+ # macOS
174
+ .DS_Store
175
+ .AppleDouble
176
+ .LSOverride
177
+ Icon
178
+ ._*
179
+ .DocumentRevisions-V100
180
+ .fseventsd
181
+ .Spotlight-V100
182
+ .TemporaryItems
183
+ .Trashes
184
+ .VolumeIcon.icns
185
+ .com.apple.timemachine.donotpresent
186
+ .AppleDB
187
+ .AppleDesktop
188
+ Network Trash Folder
189
+ Temporary Items
190
+ .apdisk
191
+
192
+ # Windows
193
+ Thumbs.db
194
+ Thumbs.db:encryptable
195
+ ehthumbs.db
196
+ ehthumbs_vista.db
197
+ *.tmp
198
+ *.temp
199
+ Desktop.ini
200
+ $RECYCLE.BIN/
201
+ *.cab
202
+ *.msi
203
+ *.msix
204
+ *.msm
205
+ *.msp
206
+ *.lnk
207
+
208
+ # Linux
209
+ *~
210
+ .fuse_hidden*
211
+ .directory
212
+ .Trash-*
213
+ .nfs*
214
+
215
+ # Machine Learning specific
216
+ *.joblib
217
+ *.pkl
218
+ *.pickle
219
+ *.h5
220
+ *.hdf5
221
+ *.model
222
+ *.weights
223
+ *.ckpt
224
+ *.pt
225
+ *.pth
226
+ *.onnx
227
+ *.tflite
228
+ *.pb
229
+ *.savedmodel/
230
+ checkpoints/
231
+ models/
232
+ logs/
233
+ runs/
234
+ wandb/
235
+ mlruns/
236
+ .mlflow/
237
+
238
+ # Data files (uncomment if you don't want to track data)
239
+ # *.csv
240
+ # *.json
241
+ # *.xml
242
+ # *.xlsx
243
+ # *.xls
244
+ # *.parquet
245
+ # *.feather
246
+ # *.hdf
247
+ # *.h5
248
+
249
+ # Temporary files
250
+ *.tmp
251
+ *.temp
252
+ *.bak
253
+ *.backup
254
+ *.old
255
+ *.orig
256
+
257
+ # Logs
258
+ *.log
259
+ logs/
260
+ log/
261
+
262
+ # Configuration files with sensitive data
263
+ config.ini
264
+ secrets.json
265
+ .env.local
266
+ .env.production
267
+ .env.staging
268
+
269
+ # OS generated files
270
+ .DS_Store
271
+ .DS_Store?
272
+ ._*
273
+ .Spotlight-V100
274
+ .Trashes
275
+ ehthumbs.db
276
+ Thumbs.db
277
+
README.md ADDED
@@ -0,0 +1,314 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Daily Household Electricity Consumption Predictor
2
+
3
+ A web-based application designed to help Nigerian households estimate their daily electricity usage in Kilowatt-hours (kWh). This project serves as a practical learning vehicle for Machine Learning Operations (MLOps), covering the full lifecycle from data preparation and model training to deployment, monitoring, and continuous improvement.
4
+
5
+ ## 🎯 Project Goals
6
+
7
+ ### Business Goals
8
+
9
+ - **Empower Households**: Provide users with a simple, accessible tool to understand and predict their daily electricity consumption
10
+ - **Promote Energy Awareness**: Help users identify factors influencing their electricity usage, encouraging more efficient energy habits
11
+ - **Inform Budgeting**: Enable users to better estimate their electricity bills, reducing financial surprises
12
+ - **Foundational MLOps Learning**: Serve as a concrete project to apply and understand core MLOps principles
13
+
14
+ ### Machine Learning & Technical Goals
15
+
16
+ - **Accurate Prediction**: Develop a regression model capable of predicting daily kWh consumption with acceptable accuracy
17
+ - **User-Friendly Interface**: Create an intuitive web interface that allows easy input of features and clear display of predictions
18
+ - **Deployable Application**: Build a self-contained application that can be deployed to a public platform
19
+ - **MLOps Readiness**: Design the application with modularity and best practices that facilitate future MLOps implementation
20
+
21
+ ## 🏗️ Project Structure
22
+
23
+ ```
24
+ lin-re-model/
25
+ ├── src/
26
+ │ ├── __init__.py
27
+ │ ├── data_generator.py # Synthetic data generation
28
+ │ ├── model.py # ML model training and prediction
29
+ │ └── app.py # Gradio web interface
30
+ ├── tests/
31
+ │ ├── __init__.py
32
+ │ ├── test_data_generator.py # Data generator tests
33
+ │ ├── test_model.py # Model tests
34
+ │ ├── test_app.py # Application tests
35
+ │ └── test_integration.py # Integration tests
36
+ ├── requirements.txt # Python dependencies
37
+ ├── pytest.ini # Pytest configuration
38
+ ├── run_tests.py # Test runner script
39
+ └── README.md # This file
40
+ ```
41
+
42
+ ## 🚀 Quick Start
43
+
44
+ ### Prerequisites
45
+
46
+ - Python 3.8 or higher
47
+ - pip (Python package installer)
48
+
49
+ ### Installation
50
+
51
+ 1. **Clone the repository** (if not already done):
52
+
53
+ ```bash
54
+ git clone <repository-url>
55
+ cd lin-re-model
56
+ ```
57
+
58
+ 2. **Install dependencies**:
59
+
60
+ ```bash
61
+ pip install -r requirements.txt
62
+ ```
63
+
64
+ 3. **Run the application**:
65
+
66
+ ```bash
67
+ python src/app.py
68
+ ```
69
+
70
+ 4. **Open your browser** and navigate to `http://localhost:7860`
71
+
72
+ ## 🧪 Testing
73
+
74
+ This project includes comprehensive tests to ensure code quality and functionality. The test suite covers:
75
+
76
+ - **Unit Tests**: Individual component testing
77
+ - **Integration Tests**: End-to-end workflow testing
78
+ - **Data Quality Tests**: Validation of synthetic data generation
79
+ - **Model Performance Tests**: Verification of model accuracy and consistency
80
+
81
+ ### Running Tests
82
+
83
+ #### Option 1: Using the test runner script
84
+
85
+ ```bash
86
+ # Run all tests with coverage
87
+ python run_tests.py
88
+
89
+ # Run only unit tests
90
+ python run_tests.py unit
91
+
92
+ # Run only integration tests
93
+ python run_tests.py integration
94
+ ```
95
+
96
+ #### Option 2: Using pytest directly
97
+
98
+ ```bash
99
+ # Run all tests
100
+ pytest
101
+
102
+ # Run with verbose output
103
+ pytest -v
104
+
105
+ # Run with coverage report
106
+ pytest --cov=src --cov-report=html
107
+
108
+ # Run specific test file
109
+ pytest tests/test_model.py
110
+
111
+ # Run specific test class
112
+ pytest tests/test_model.py::TestElectricityConsumptionModel
113
+
114
+ # Run specific test method
115
+ pytest tests/test_model.py::TestElectricityConsumptionModel::test_train_model
116
+ ```
117
+
118
+ ### Test Coverage
119
+
120
+ The test suite provides comprehensive coverage including:
121
+
122
+ - **Data Generator Tests**:
123
+
124
+ - Data generation with different parameters
125
+ - Data splitting functionality
126
+ - Data persistence (save/load)
127
+ - Data quality validation
128
+ - Reproducibility checks
129
+
130
+ - **Model Tests**:
131
+
132
+ - Model initialization and training
133
+ - Feature preparation and validation
134
+ - Prediction functionality
135
+ - Model evaluation metrics
136
+ - Model persistence (save/load)
137
+ - Error handling
138
+
139
+ - **Application Tests**:
140
+
141
+ - Web interface functionality
142
+ - User interaction flows
143
+ - Error handling in UI
144
+ - State management
145
+
146
+ - **Integration Tests**:
147
+ - Complete workflow testing
148
+ - End-to-end functionality
149
+ - Performance consistency
150
+ - Data quality across components
151
+
152
+ ### Expected Test Results
153
+
154
+ When all tests pass, you should see output similar to:
155
+
156
+ ```
157
+ 🧪 Running Daily Household Electricity Consumption Predictor Tests
158
+ ======================================================================
159
+ ============================= test session starts ==============================
160
+ platform linux -- Python 3.8.x, pytest-7.4.0, pluggy-1.0.0
161
+ rootdir: /path/to/lin-re-model
162
+ plugins: cov-4.1.0
163
+ collected 45 tests
164
+
165
+ tests/test_app.py ................... [ 42%]
166
+ tests/test_data_generator.py ................... [ 78%]
167
+ tests/test_integration.py .......... [100%]
168
+
169
+ ---------- coverage: platform linux, python 3.8.x-final-0 -----------
170
+ Name Stmts Miss Cover Missing
171
+ ------------------------------------------------------------
172
+ src/__init__.py 1 0 100%
173
+ src/app.py 180 5 97% 180-185
174
+ src/data_generator.py 95 2 98% 95-97
175
+ src/model.py 180 8 96% 180-188
176
+ ------------------------------------------------------------
177
+ TOTAL 456 15 97%
178
+
179
+ ============================== 45 passed in 5.23s ==============================
180
+
181
+ ✅ All tests passed!
182
+ ```
183
+
184
+ ## 📊 Model Features
185
+
186
+ The electricity consumption prediction model uses the following features:
187
+
188
+ 1. **Average Daily Temperature** (°C): Numerical input (15-35°C range)
189
+ 2. **Day of the Week**: Categorical input (Monday through Sunday)
190
+ 3. **Major Event**: Boolean input (Holiday, Power Outage, etc.)
191
+
192
+ ### Model Algorithm
193
+
194
+ - **Algorithm**: Linear Regression
195
+ - **Preprocessing**: StandardScaler for numerical features, OneHotEncoder for categorical features
196
+ - **Evaluation Metrics**: MSE, RMSE, MAE, R²
197
+
198
+ ## 🎮 Using the Application
199
+
200
+ ### Step 1: Generate Data & Train Model
201
+
202
+ 1. Navigate to the "Data Generation & Training" tab
203
+ 2. Adjust parameters as desired:
204
+ - Number of Data Points (100-5000)
205
+ - Noise Level (0.01-0.5)
206
+ - Training/Validation/Test Set Proportions
207
+ 3. Click "Generate Data & Train Model"
208
+ 4. Review the training metrics and evaluation results
209
+
210
+ ### Step 2: Make Predictions
211
+
212
+ 1. Navigate to the "Prediction" tab
213
+ 2. Enter your parameters:
214
+ - Average Daily Temperature (15-35°C)
215
+ - Day of the Week
216
+ - Major Event (checkbox)
217
+ 3. Click "Predict Consumption"
218
+ 4. View your estimated daily electricity consumption
219
+
220
+ ### Step 3: Understand the Model
221
+
222
+ 1. Navigate to the "Model Information" tab
223
+ 2. Click "Show Model Information"
224
+ 3. Review feature coefficients and model interpretation
225
+
226
+ ## 🔧 Development
227
+
228
+ ### Adding New Tests
229
+
230
+ To add new tests:
231
+
232
+ 1. **Unit Tests**: Add to appropriate test file in `tests/`
233
+ 2. **Integration Tests**: Add to `tests/test_integration.py`
234
+ 3. **Follow naming convention**: `test_<functionality>`
235
+ 4. **Use descriptive docstrings**: Explain what the test validates
236
+
237
+ ### Test Best Practices
238
+
239
+ - **Isolation**: Each test should be independent
240
+ - **Descriptive names**: Test names should clearly indicate what they test
241
+ - **Assertions**: Use specific assertions with meaningful messages
242
+ - **Coverage**: Aim for high test coverage (>95%)
243
+ - **Performance**: Tests should run quickly (<10 seconds total)
244
+
245
+ ### Running Tests in Development
246
+
247
+ During development, you can run tests in different ways:
248
+
249
+ ```bash
250
+ # Quick test run (no coverage)
251
+ pytest -x # Stop on first failure
252
+
253
+ # Run tests in parallel (if pytest-xdist installed)
254
+ pytest -n auto
255
+
256
+ # Run tests with detailed output
257
+ pytest -v -s
258
+
259
+ # Run tests and watch for changes
260
+ pytest-watch # Requires pytest-watch package
261
+ ```
262
+
263
+ ## 🚀 Deployment
264
+
265
+ ### Local Deployment
266
+
267
+ ```bash
268
+ python src/app.py
269
+ ```
270
+
271
+ ### Hugging Face Spaces Deployment
272
+
273
+ 1. Create a new Space on Hugging Face
274
+ 2. Upload the project files
275
+ 3. Configure the Space to run `python src/app.py`
276
+ 4. The application will be available at your Space URL
277
+
278
+ ## 📈 Future Enhancements
279
+
280
+ ### MLOps Features (Future Phases)
281
+
282
+ - **Data Versioning**: Implement DVC for data version control
283
+ - **Experiment Tracking**: Integrate MLflow or Weights & Biases
284
+ - **Model Registry**: Use MLflow Model Registry for model lifecycle management
285
+ - **Containerization**: Create Dockerfile for reproducible environments
286
+ - **CI/CD**: Set up GitHub Actions for automated testing and deployment
287
+ - **Model Monitoring**: Implement monitoring for data drift and performance degradation
288
+ - **Continuous Training**: Define triggers for automated retraining
289
+
290
+ ### Model Improvements
291
+
292
+ - **Feature Engineering**: Add more complex features (historical averages, time of day, etc.)
293
+ - **Advanced Models**: Experiment with Random Forest, Gradient Boosting, etc.
294
+ - **Hyperparameter Tuning**: Implement automated hyperparameter optimization
295
+ - **Ensemble Methods**: Combine multiple models for better predictions
296
+
297
+ ## 🤝 Contributing
298
+
299
+ 1. Fork the repository
300
+ 2. Create a feature branch
301
+ 3. Make your changes
302
+ 4. Add tests for new functionality
303
+ 5. Ensure all tests pass
304
+ 6. Submit a pull request
305
+
306
+ ## 📄 License
307
+
308
+ This project is licensed under the MIT License - see the LICENSE file for details.
309
+
310
+ ## 🙏 Acknowledgments
311
+
312
+ - Gradio team for the excellent web interface framework
313
+ - Scikit-learn team for the machine learning library
314
+ - The MLOps community for best practices and guidance
pytest.ini ADDED
@@ -0,0 +1,18 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ [tool:pytest]
2
+ testpaths = tests
3
+ python_files = test_*.py
4
+ python_classes = Test*
5
+ python_functions = test_*
6
+ addopts =
7
+ -v
8
+ --tb=short
9
+ --strict-markers
10
+ --disable-warnings
11
+ --cov=src
12
+ --cov-report=term-missing
13
+ --cov-report=html:htmlcov
14
+ --cov-report=xml
15
+ markers =
16
+ unit: Unit tests
17
+ integration: Integration tests
18
+ slow: Slow running tests
requirements.txt ADDED
@@ -0,0 +1,7 @@
 
 
 
 
 
 
 
 
1
+ scikit-learn==1.3.0
2
+ pandas==2.0.3
3
+ numpy==1.24.3
4
+ gradio==3.40.1
5
+ pytest==7.4.0
6
+ pytest-cov==4.1.0
7
+ joblib==1.3.2
run_tests.py ADDED
@@ -0,0 +1,118 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Test runner script for the Daily Household Electricity Consumption Predictor.
4
+
5
+ This script runs all tests and provides a summary of results.
6
+ """
7
+
8
+ import subprocess
9
+ import sys
10
+ import os
11
+ from pathlib import Path
12
+
13
+
14
def run_tests():
    """Run the full test suite with coverage reporting.

    Returns:
        bool: True when pytest exits with status 0, False on test
        failure or when pytest could not be launched at all.
    """
    print("🧪 Running Daily Household Electricity Consumption Predictor Tests")
    print("=" * 70)

    # Always execute from the project root so relative paths
    # (tests/, src/, coverage output dirs) resolve correctly.
    os.chdir(Path(__file__).parent)

    pytest_args = [
        "--verbose",
        "--tb=short",
        "--cov=src",
        "--cov-report=term-missing",
        "--cov-report=html:htmlcov",
        "--cov-report=xml",
        "tests/",
    ]

    try:
        # Use the current interpreter so the right environment is tested.
        completed = subprocess.run(
            [sys.executable, "-m", "pytest", *pytest_args],
            capture_output=False,
            text=True,
        )
        return completed.returncode == 0
    except Exception as e:
        print(f"❌ Error running tests: {e}")
        return False
43
+
44
+
45
def run_unit_tests():
    """Run only the tests carrying the ``unit`` pytest marker.

    Returns:
        bool: True when pytest exits with status 0, False otherwise.
    """
    print("🧪 Running Unit Tests")
    print("=" * 40)

    command = [
        sys.executable,
        "-m",
        "pytest",
        "--verbose",
        "--tb=short",
        "-m",  # select by marker expression
        "unit",
        "tests/",
    ]

    try:
        completed = subprocess.run(command, capture_output=False, text=True)
        return completed.returncode == 0
    except Exception as e:
        print(f"❌ Error running unit tests: {e}")
        return False
67
+
68
+
69
def run_integration_tests():
    """Run only the tests carrying the ``integration`` pytest marker.

    Returns:
        bool: True when pytest exits with status 0, False otherwise.
    """
    print("🧪 Running Integration Tests")
    print("=" * 40)

    command = [
        sys.executable,
        "-m",
        "pytest",
        "--verbose",
        "--tb=short",
        "-m",  # select by marker expression
        "integration",
        "tests/",
    ]

    try:
        completed = subprocess.run(command, capture_output=False, text=True)
        return completed.returncode == 0
    except Exception as e:
        print(f"❌ Error running integration tests: {e}")
        return False
91
+
92
+
93
def main():
    """Dispatch to the requested test run based on the first CLI argument.

    With no argument the full suite runs; ``unit`` / ``integration``
    select the corresponding marker subset.

    Returns:
        int: Process exit code — 0 when tests pass, 1 on failure or an
        unknown option.
    """
    selected = sys.argv[1].lower() if len(sys.argv) > 1 else None

    if selected is None:
        success = run_tests()
    elif selected == "unit":
        success = run_unit_tests()
    elif selected == "integration":
        success = run_integration_tests()
    else:
        print(f"❌ Unknown test type: {selected}")
        print("Available options: unit, integration, all (default)")
        return 1

    if success:
        print("\n✅ All tests passed!")
        return 0
    print("\n❌ Some tests failed!")
    return 1


if __name__ == "__main__":
    sys.exit(main())
src/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ # Daily Household Electricity Consumption Predictor
src/app.py ADDED
@@ -0,0 +1,373 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Gradio Web Application for Daily Household Electricity Consumption Predictor
3
+
4
+ This module provides a user-friendly web interface for the electricity consumption
5
+ prediction model using Gradio.
6
+ """
7
+
8
+ import gradio as gr
9
+ import pandas as pd
10
+ import numpy as np
11
+ from typing import Tuple, Dict, Any
12
+ import os
13
+ import sys
14
+
15
+ # Add src to path for imports
16
+ sys.path.append(os.path.join(os.path.dirname(__file__), ".."))
17
+
18
+ from src.data_generator import DataGenerator
19
+ from src.model import ElectricityConsumptionModel
20
+
21
+
22
class ElectricityPredictorApp:
    """Gradio application for electricity consumption prediction.

    Ties together the synthetic ``DataGenerator`` and the
    ``ElectricityConsumptionModel`` behind a three-tab web UI:
    data generation & training, prediction, and model inspection.
    """

    def __init__(self):
        """Initialize the application with model and data generator."""
        # Fixed seed keeps the demo data reproducible between runs.
        self.data_generator = DataGenerator(seed=42)
        self.model = ElectricityConsumptionModel()
        # Prediction and model-info actions refuse to run until True.
        self.is_model_trained = False

    def generate_and_train(
        self,
        n_samples: int,
        noise_level: float,
        train_size: float,
        val_size: float,
        test_size: float,
    ) -> Tuple[str, str, str]:
        """
        Generate synthetic data and train the model.

        Args:
            n_samples: Number of data points to generate
            noise_level: Level of noise in the data
            train_size: Proportion for training set
            val_size: Proportion for validation set
            test_size: Proportion for test set

        Returns:
            Tuple of (data_info, training_metrics, evaluation_metrics)
            as Markdown strings; on failure the first element carries the
            error message and the other two are empty strings.
        """
        try:
            # Generate synthetic data with the requested size/noise.
            data = self.data_generator.generate_data(n_samples, noise_level)

            # Split into train/validation/test partitions.
            train_data, val_data, test_data = self.data_generator.split_data(
                data, train_size, val_size, test_size
            )

            # Keep the splits on the instance for later inspection.
            self.train_data = train_data
            self.val_data = val_data
            self.test_data = test_data

            # Train on features only; target stays a one-column frame.
            X_train = train_data.drop("consumption_kwh", axis=1)
            y_train = train_data[["consumption_kwh"]]

            training_metrics = self.model.train(X_train, y_train)

            # Evaluate on the held-out test split.
            X_test = test_data.drop("consumption_kwh", axis=1)
            y_test = test_data[["consumption_kwh"]]

            evaluation_metrics = self.model.evaluate(X_test, y_test)

            self.is_model_trained = True

            # Format the three Markdown panels shown in the UI.
            data_info = f"""
            **Data Generated Successfully!**

            - Total samples: {len(data)}
            - Training samples: {len(train_data)}
            - Validation samples: {len(val_data)}
            - Test samples: {len(test_data)}

            **Data Statistics:**
            - Temperature range: {data['temperature'].min():.1f}°C - {data['temperature'].max():.1f}°C
            - Consumption range: {data['consumption_kwh'].min():.1f} - {data['consumption_kwh'].max():.1f} kWh
            - Average consumption: {data['consumption_kwh'].mean():.1f} kWh
            """

            training_metrics_str = f"""
            **Training Metrics:**
            - Mean Squared Error (MSE): {training_metrics['train_mse']:.4f}
            - Root Mean Squared Error (RMSE): {training_metrics['train_rmse']:.4f}
            - Mean Absolute Error (MAE): {training_metrics['train_mae']:.4f}
            - R-squared (R²): {training_metrics['train_r2']:.4f}
            """

            evaluation_metrics_str = f"""
            **Test Set Evaluation:**
            - Mean Squared Error (MSE): {evaluation_metrics['test_mse']:.4f}
            - Root Mean Squared Error (RMSE): {evaluation_metrics['test_rmse']:.4f}
            - Mean Absolute Error (MAE): {evaluation_metrics['test_mae']:.4f}
            - R-squared (R²): {evaluation_metrics['test_r2']:.4f}
            """

            return data_info, training_metrics_str, evaluation_metrics_str

        except Exception as e:
            # Surface the error through the first output panel.
            error_msg = f"Error during data generation and training: {str(e)}"
            return error_msg, "", ""

    def predict_consumption(
        self, temperature: float, day_of_week: str, major_event: bool
    ) -> str:
        """
        Make a prediction for electricity consumption.

        Args:
            temperature: Average daily temperature in Celsius
            day_of_week: Day of the week
            major_event: Whether there's a major event

        Returns:
            Formatted prediction result, or an error message when the
            model has not been trained yet.
        """
        if not self.is_model_trained:
            return "**Error:** Model must be trained first. Please generate data and train the model."

        try:
            # The model expects the event flag as 0/1, not bool.
            major_event_int = 1 if major_event else 0

            prediction = self.model.predict(temperature, day_of_week, major_event_int)

            # Coefficients are shown alongside the prediction for context.
            coefficients = self.model.get_model_coefficients()

            result = f"""
            **Prediction Result:**

            **Estimated Daily Electricity Consumption: {prediction:.1f} kWh**

            **Input Parameters:**
            - Temperature: {temperature}°C
            - Day of Week: {day_of_week}
            - Major Event: {'Yes' if major_event else 'No'}

            **Model Information:**
            - Model Type: Linear Regression
            - Intercept: {coefficients['intercept']:.4f}
            - Number of Features: {len(coefficients['feature_names'])}
            """

            return result

        except Exception as e:
            return f"**Error during prediction:** {str(e)}"

    def get_model_info(self) -> str:
        """
        Get detailed information about the trained model.

        Returns:
            Formatted model information, or an error message when the
            model has not been trained yet.
        """
        if not self.is_model_trained:
            return "**Error:** Model must be trained first."

        try:
            coefficients = self.model.get_model_coefficients()
            # NOTE: removed a leftover debug print of the raw coefficients.

            # Render one Markdown table row per feature coefficient.
            feature_table = "\n".join(
                f"| {feature} | {coef:.4f} |"
                for feature, coef in zip(
                    coefficients["feature_names"], coefficients["coefficients"]
                )
            )

            info = f"""
            **Model Information:**

            **Model Type:** Linear Regression

            **Intercept:** {coefficients['intercept']:.4f}

            **Feature Coefficients:**
            | Feature | Coefficient |
            |---------|-------------|
            {feature_table}

            **Interpretation:**
            - Positive coefficients increase predicted consumption
            - Negative coefficients decrease predicted consumption
            - Temperature coefficient shows how much consumption changes per degree Celsius
            - Day coefficients show consumption differences compared to Monday (baseline)
            - Major event coefficient shows additional consumption during events
            """

            return info

        except Exception as e:
            return f"**Error getting model info:** {str(e)}"

    def create_interface(self) -> gr.Interface:
        """
        Create the Gradio interface.

        Returns:
            Gradio Interface object with three tabs: training,
            prediction, and model information.
        """
        with gr.Blocks(
            title="Daily Household Electricity Consumption Predictor"
        ) as interface:
            gr.Markdown(
                """
            # ⚡ Daily Household Electricity Consumption Predictor

            This application helps Nigerian households estimate their daily electricity consumption
            based on temperature, day of the week, and major events.

            ## How to Use:
            1. **Generate Data & Train Model**: Click the button to generate synthetic data and train the model
            2. **Make Predictions**: Enter your parameters and get consumption estimates
            3. **View Model Info**: See how the model works and feature importance
            """
            )

            with gr.Tab("Data Generation & Training"):
                gr.Markdown("### Step 1: Generate Synthetic Data and Train Model")

                with gr.Row():
                    with gr.Column():
                        n_samples = gr.Slider(
                            minimum=100,
                            maximum=5000,
                            value=1000,
                            step=100,
                            label="Number of Data Points",
                        )
                        noise_level = gr.Slider(
                            minimum=0.01,
                            maximum=0.5,
                            value=0.1,
                            step=0.01,
                            label="Noise Level",
                        )

                    with gr.Column():
                        train_size = gr.Slider(
                            minimum=0.5,
                            maximum=0.9,
                            value=0.7,
                            step=0.05,
                            label="Training Set Proportion",
                        )
                        val_size = gr.Slider(
                            minimum=0.05,
                            maximum=0.3,
                            value=0.15,
                            step=0.05,
                            label="Validation Set Proportion",
                        )
                        test_size = gr.Slider(
                            minimum=0.05,
                            maximum=0.3,
                            value=0.15,
                            step=0.05,
                            label="Test Set Proportion",
                        )

                train_button = gr.Button(
                    "Generate Data & Train Model", variant="primary"
                )

                with gr.Row():
                    data_info = gr.Markdown("**Data information will appear here...**")

                with gr.Row():
                    training_metrics = gr.Markdown(
                        "**Training metrics will appear here...**"
                    )
                    evaluation_metrics = gr.Markdown(
                        "**Evaluation metrics will appear here...**"
                    )

                train_button.click(
                    fn=self.generate_and_train,
                    inputs=[n_samples, noise_level, train_size, val_size, test_size],
                    outputs=[data_info, training_metrics, evaluation_metrics],
                )

            with gr.Tab("Prediction"):
                gr.Markdown("### Step 2: Predict Electricity Consumption")

                with gr.Row():
                    with gr.Column():
                        temperature = gr.Slider(
                            minimum=15,
                            maximum=35,
                            value=25,
                            step=0.5,
                            label="Average Daily Temperature (°C)",
                        )
                        day_of_week = gr.Dropdown(
                            choices=[
                                "Monday",
                                "Tuesday",
                                "Wednesday",
                                "Thursday",
                                "Friday",
                                "Saturday",
                                "Sunday",
                            ],
                            value="Monday",
                            label="Day of the Week",
                        )
                        major_event = gr.Checkbox(
                            label="Major Event (Holiday, Power Outage, etc.)",
                            value=False,
                        )

                    with gr.Column():
                        predict_button = gr.Button(
                            "Predict Consumption", variant="primary"
                        )
                        prediction_result = gr.Markdown(
                            "**Prediction result will appear here...**"
                        )

                predict_button.click(
                    fn=self.predict_consumption,
                    inputs=[temperature, day_of_week, major_event],
                    outputs=prediction_result,
                )

            with gr.Tab("Model Information"):
                gr.Markdown("### Step 3: Understand the Model")

                info_button = gr.Button("Show Model Information", variant="secondary")
                model_info = gr.Markdown("**Model information will appear here...**")

                info_button.click(fn=self.get_model_info, inputs=[], outputs=model_info)

            gr.Markdown(
                """
            ---
            **Note:** This application uses synthetic data for demonstration purposes.
            In a real-world scenario, you would use actual historical consumption data.
            """
            )

        return interface
363
+
364
+
365
def main():
    """Build the Gradio interface and serve it on port 7860."""
    interface = ElectricityPredictorApp().create_interface()
    # Bind to all interfaces so the app is reachable inside containers
    # and on Hugging Face Spaces; no public share link is created.
    interface.launch(share=False, server_name="0.0.0.0", server_port=7860)


if __name__ == "__main__":
    main()
src/data_generator.py ADDED
@@ -0,0 +1,164 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Data Generator Module for Daily Household Electricity Consumption Predictor
3
+
4
+ This module generates synthetic data for training and testing the electricity consumption
5
+ prediction model. It creates realistic patterns based on temperature, day of week, and events.
6
+ """
7
+
8
+ import numpy as np
9
+ import pandas as pd
10
+ from typing import Tuple, Optional
11
+ import random
12
+
13
+
14
class DataGenerator:
    """Generates synthetic electricity consumption data for training and testing."""

    # Additive weekday load offsets in kWh; weekends run noticeably higher.
    _DAY_EFFECT = {
        "Monday": 0.5,
        "Tuesday": 0.3,
        "Wednesday": 0.2,
        "Thursday": 0.1,
        "Friday": 0.8,
        "Saturday": 1.5,
        "Sunday": 1.2,
    }

    def __init__(self, seed: Optional[int] = 42):
        """
        Create a generator, optionally seeding both RNGs for reproducibility.

        Args:
            seed: Random seed; ``None`` leaves the global RNG state untouched.
        """
        self.seed = seed
        if seed is None:
            return
        np.random.seed(seed)
        random.seed(seed)

    def generate_data(
        self, n_samples: int = 1000, noise_level: float = 0.1
    ) -> pd.DataFrame:
        """
        Build a synthetic daily-consumption dataset.

        Args:
            n_samples: Number of rows to generate.
            noise_level: Gaussian noise magnitude relative to the clean
                signal's standard deviation (0-1).

        Returns:
            DataFrame with columns temperature, day_of_week, major_event,
            consumption_kwh.
        """
        # Daily mean temperature: N(25, 8) clipped to the supported 15-35 °C band.
        temperatures = np.clip(np.random.normal(25, 8, n_samples), 15, 35)

        # Uniformly random weekday labels (insertion order matches Mon..Sun).
        days_of_week = np.random.choice(list(self._DAY_EFFECT.keys()), n_samples)

        # Roughly one day in ten coincides with a major event.
        major_events = np.random.choice([0, 1], n_samples, p=[0.9, 0.1])

        # Deterministic signal: 15 kWh baseline plus temperature, weekday and
        # event contributions (events add a flat 2 kWh).
        signal = (
            15.0
            + 0.3 * (temperatures - 25)
            + np.array([self._DAY_EFFECT[day] for day in days_of_week])
            + 2.0 * major_events
        )

        # Gaussian noise scaled to the clean signal's spread.
        signal = signal + np.random.normal(0, noise_level * np.std(signal), n_samples)

        # Consumption never drops below a 5 kWh floor.
        consumption = np.maximum(signal, 5.0)

        return pd.DataFrame(
            {
                "temperature": temperatures,
                "day_of_week": days_of_week,
                "major_event": major_events,
                "consumption_kwh": consumption,
            }
        )

    def split_data(
        self,
        data: pd.DataFrame,
        train_size: float = 0.7,
        val_size: float = 0.15,
        test_size: float = 0.15,
    ) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]:
        """
        Shuffle and partition data into train/validation/test subsets.

        Args:
            data: Input DataFrame.
            train_size: Fraction assigned to the training set.
            val_size: Fraction assigned to the validation set.
            test_size: Fraction assigned to the test set.

        Returns:
            Tuple of (train_data, val_data, test_data).
        """
        total = train_size + val_size + test_size
        assert abs(total - 1.0) < 1e-6, "Split proportions must sum to 1"

        # Deterministic shuffle keyed on the generator's seed.
        shuffled = data.sample(frac=1, random_state=self.seed).reset_index(drop=True)

        n = len(shuffled)
        train_cut = int(n * train_size)
        val_cut = train_cut + int(n * val_size)

        return (
            shuffled[:train_cut],
            shuffled[train_cut:val_cut],
            shuffled[val_cut:],
        )

    def save_data(self, data: pd.DataFrame, filepath: str) -> None:
        """Write *data* to *filepath* as CSV without the index column."""
        data.to_csv(filepath, index=False)

    def load_data(self, filepath: str) -> pd.DataFrame:
        """Read a CSV file back into a DataFrame."""
        return pd.read_csv(filepath)
src/model.py ADDED
@@ -0,0 +1,283 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Model Module for Daily Household Electricity Consumption Predictor
3
+
4
+ This module handles data preprocessing, model training, evaluation, and prediction
5
+ for the electricity consumption prediction model.
6
+ """
7
+
8
+ import numpy as np
9
+ import pandas as pd
10
+ from sklearn.linear_model import LinearRegression
11
+ from sklearn.preprocessing import OneHotEncoder, StandardScaler
12
+ from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
13
+ from sklearn.pipeline import Pipeline
14
+ from sklearn.compose import ColumnTransformer
15
+ import joblib
16
+ from typing import Tuple, Dict, Any, Optional
17
+ import os
18
+
19
+
20
class ElectricityConsumptionModel:
    """Linear regression model for predicting daily electricity consumption.

    Wraps a scikit-learn Pipeline (scaling + one-hot encoding + LinearRegression)
    and exposes training, evaluation, single-prediction, introspection, and
    persistence helpers.
    """

    def __init__(self):
        """Initialize an untrained model; the pipeline is built at train time."""
        self.model = None  # sklearn Pipeline once trained or loaded
        self.preprocessor = None  # kept for API compatibility; pipeline owns the real one
        self.feature_names = None  # raw input column names used at fit time
        self.is_trained = False

    def _create_preprocessor(self) -> ColumnTransformer:
        """
        Create the preprocessing transformer for the raw features.

        Returns:
            ColumnTransformer with scaling, one-hot encoding, and passthrough steps
        """
        # Numerical features (temperature) are standardized.
        numerical_features = ["temperature"]
        numerical_transformer = StandardScaler()

        # Categorical features (day_of_week) are one-hot encoded; drop="first"
        # removes one column to avoid collinearity with the intercept.
        # BUGFIX: sklearn renamed `sparse` to `sparse_output` in 1.2 and removed
        # the old keyword in 1.4; `sparse=False` raises TypeError on current
        # versions.
        categorical_features = ["day_of_week"]
        categorical_transformer = OneHotEncoder(drop="first", sparse_output=False)

        # Boolean features (major_event) need no transformation.
        boolean_features = ["major_event"]
        boolean_transformer = "passthrough"

        preprocessor = ColumnTransformer(
            transformers=[
                ("num", numerical_transformer, numerical_features),
                ("cat", categorical_transformer, categorical_features),
                ("bool", boolean_transformer, boolean_features),
            ],
            remainder="drop",
        )

        return preprocessor

    def _create_pipeline(self) -> Pipeline:
        """
        Create the complete model pipeline.

        Returns:
            Pipeline chaining the preprocessor and a LinearRegression regressor
        """
        preprocessor = self._create_preprocessor()
        model = LinearRegression()

        pipeline = Pipeline([("preprocessor", preprocessor), ("regressor", model)])

        return pipeline

    def prepare_features(self, data: pd.DataFrame) -> pd.DataFrame:
        """
        Validate raw input and select the model's feature columns.

        Args:
            data: Input DataFrame with raw features

        Returns:
            Copy of *data* restricted to the required feature columns

        Raises:
            ValueError: If required columns are missing or values fall outside
                the supported ranges.
        """
        required_columns = ["temperature", "day_of_week", "major_event"]

        # Validate input data
        missing_columns = [col for col in required_columns if col not in data.columns]
        if missing_columns:
            raise ValueError(f"Missing required columns: {missing_columns}")

        # Validate data types and ranges (matches DataGenerator's 15-35 °C band).
        if not all(data["temperature"].between(15, 35)):
            raise ValueError("Temperature must be between 15 and 35 degrees Celsius")

        valid_days = [
            "Monday",
            "Tuesday",
            "Wednesday",
            "Thursday",
            "Friday",
            "Saturday",
            "Sunday",
        ]
        if not all(day in valid_days for day in data["day_of_week"].unique()):
            raise ValueError(f"Day of week must be one of: {valid_days}")

        if not all(data["major_event"].isin([0, 1])):
            raise ValueError("Major event must be 0 or 1")

        return data[required_columns].copy()

    def train(self, X_train: pd.DataFrame, y_train: pd.DataFrame) -> Dict[str, float]:
        """
        Train the model on the provided data.

        Args:
            X_train: Training features
            y_train: Training targets (must contain a "consumption_kwh" column)

        Returns:
            Dictionary with training metrics (MSE, RMSE, MAE, R²)
        """
        # Prepare features
        X_prepared = self.prepare_features(X_train)

        # Create and train pipeline
        self.model = self._create_pipeline()
        self.model.fit(X_prepared, y_train["consumption_kwh"])

        # Store feature names for later use
        self.feature_names = X_prepared.columns.tolist()
        self.is_trained = True

        # Calculate training metrics; compute MSE once and derive RMSE from it.
        y_pred = self.model.predict(X_prepared)
        mse = mean_squared_error(y_train["consumption_kwh"], y_pred)
        metrics = {
            "train_mse": mse,
            "train_rmse": np.sqrt(mse),
            "train_mae": mean_absolute_error(y_train["consumption_kwh"], y_pred),
            "train_r2": r2_score(y_train["consumption_kwh"], y_pred),
        }

        return metrics

    def evaluate(self, X_test: pd.DataFrame, y_test: pd.DataFrame) -> Dict[str, float]:
        """
        Evaluate the model on test data.

        Args:
            X_test: Test features
            y_test: Test targets (must contain a "consumption_kwh" column)

        Returns:
            Dictionary with evaluation metrics (MSE, RMSE, MAE, R²)

        Raises:
            ValueError: If the model has not been trained yet.
        """
        if not self.is_trained:
            raise ValueError("Model must be trained before evaluation")

        # Prepare features
        X_prepared = self.prepare_features(X_test)

        # Make predictions
        y_pred = self.model.predict(X_prepared)

        # Calculate metrics; compute MSE once and derive RMSE from it.
        mse = mean_squared_error(y_test["consumption_kwh"], y_pred)
        metrics = {
            "test_mse": mse,
            "test_rmse": np.sqrt(mse),
            "test_mae": mean_absolute_error(y_test["consumption_kwh"], y_pred),
            "test_r2": r2_score(y_test["consumption_kwh"], y_pred),
        }

        return metrics

    def predict(self, temperature: float, day_of_week: str, major_event: int) -> float:
        """
        Make a single prediction.

        Args:
            temperature: Average daily temperature in Celsius (15-35)
            day_of_week: Day of the week (e.g. "Monday")
            major_event: Whether there's a major event (0 or 1)

        Returns:
            Predicted electricity consumption in kWh (clamped to be non-negative)

        Raises:
            ValueError: If the model has not been trained yet or inputs are
                out of range.
        """
        if not self.is_trained:
            raise ValueError("Model must be trained before making predictions")

        # Create a one-row input DataFrame for the pipeline.
        input_data = pd.DataFrame(
            {
                "temperature": [temperature],
                "day_of_week": [day_of_week],
                "major_event": [major_event],
            }
        )

        # Prepare features (also validates ranges)
        X_prepared = self.prepare_features(input_data)

        # Make prediction
        prediction = self.model.predict(X_prepared)[0]

        return max(0, prediction)  # Ensure non-negative prediction

    def get_model_coefficients(self) -> Dict[str, Any]:
        """
        Get model coefficients and their feature names.

        Returns:
            Dictionary with "feature_names", "coefficients", and "intercept"

        Raises:
            ValueError: If the model has not been trained yet.
        """
        if not self.is_trained:
            raise ValueError("Model must be trained before accessing coefficients")

        preprocessor = self.model.named_steps["preprocessor"]

        # BUGFIX: OneHotEncoder sorts categories alphabetically, so with
        # drop="first" the dropped weekday is "Friday", not "Monday" as the
        # previous hard-coded list assumed. Derive the kept category names
        # from the fitted encoder so labels always match the coefficients.
        cat_encoder = preprocessor.named_transformers_["cat"]
        kept_days = list(cat_encoder.categories_[0][1:])

        feature_names = ["temperature"]
        feature_names.extend(f"day_{day.lower()}" for day in kept_days)
        feature_names.append("major_event")

        # Get coefficients from the fitted linear model.
        coefficients = self.model.named_steps["regressor"].coef_
        intercept = self.model.named_steps["regressor"].intercept_

        return {
            "feature_names": feature_names,
            "coefficients": coefficients.tolist(),
            "intercept": float(intercept),
        }

    def save_model(self, filepath: str) -> None:
        """
        Save the trained model to disk.

        Args:
            filepath: Path to save the model

        Raises:
            ValueError: If the model has not been trained yet.
        """
        if not self.is_trained:
            raise ValueError("Model must be trained before saving")

        # Create the parent directory if needed. BUGFIX: os.path.dirname
        # returns "" for bare filenames and os.makedirs("") raises, so only
        # create a directory when one is actually present in the path.
        directory = os.path.dirname(filepath)
        if directory:
            os.makedirs(directory, exist_ok=True)

        # Save model
        joblib.dump(self.model, filepath)

    def load_model(self, filepath: str) -> None:
        """
        Load a trained model from disk.

        Args:
            filepath: Path to the saved model

        Raises:
            FileNotFoundError: If no file exists at *filepath*.
        """
        if not os.path.exists(filepath):
            raise FileNotFoundError(f"Model file not found: {filepath}")

        self.model = joblib.load(filepath)
        self.is_trained = True

        # Restore the raw feature-column names expected by prepare_features.
        preprocessor = self.model.named_steps["preprocessor"]
        self.feature_names = ["temperature", "day_of_week", "major_event"]
tests/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ # Test package for Daily Household Electricity Consumption Predictor
tests/test_app.py ADDED
@@ -0,0 +1,355 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Tests for the Gradio Application module.
3
+
4
+ This module contains tests for the ElectricityPredictorApp class to ensure
5
+ the web interface functions correctly.
6
+ """
7
+
8
+ import pytest
9
+ import pandas as pd
10
+ import numpy as np
11
+ from unittest.mock import patch, MagicMock
12
+ from src.app import ElectricityPredictorApp
13
+
14
+
15
class TestElectricityPredictorApp:
    """Test cases for ElectricityPredictorApp class.

    All collaborators (DataGenerator and the model) are patched with mocks,
    so these tests exercise only the app's orchestration and the Markdown
    strings it produces.
    """

    def setup_method(self):
        """Set up test app for each test method."""
        # A fresh app per test keeps is_model_trained and stored data isolated.
        self.app = ElectricityPredictorApp()

    def test_initialization(self):
        """Test app initialization."""
        assert self.app.data_generator is not None
        assert self.app.model is not None
        assert not self.app.is_model_trained

    def test_generate_and_train_success(self):
        """Test successful data generation and training."""
        # Mock the data generator methods
        with patch.object(
            self.app.data_generator, "generate_data"
        ) as mock_generate, patch.object(
            self.app.data_generator, "split_data"
        ) as mock_split, patch.object(
            self.app.model, "train"
        ) as mock_train, patch.object(
            self.app.model, "evaluate"
        ) as mock_evaluate:

            # Create mock data
            mock_data = pd.DataFrame(
                {
                    "temperature": [25.0, 30.0],
                    "day_of_week": ["Monday", "Tuesday"],
                    "major_event": [0, 1],
                    "consumption_kwh": [15.0, 18.0],
                }
            )
            mock_generate.return_value = mock_data

            # Create mock split data
            train_data = mock_data.iloc[:1]
            val_data = mock_data.iloc[1:2]
            test_data = mock_data.iloc[1:2]
            mock_split.return_value = (train_data, val_data, test_data)

            mock_train.return_value = {
                "train_mse": 2.5,
                "train_rmse": 1.58,
                "train_mae": 1.2,
                "train_r2": 0.85,
            }
            mock_evaluate.return_value = {
                "test_mse": 2.8,
                "test_rmse": 1.67,
                "test_mae": 1.3,
                "test_r2": 0.82,
            }

            # Call the method
            data_info, training_metrics, evaluation_metrics = (
                self.app.generate_and_train(
                    n_samples=1000,
                    noise_level=0.1,
                    train_size=0.7,
                    val_size=0.15,
                    test_size=0.15,
                )
            )

            # Check that methods were called with the positional arguments
            # generate_and_train is expected to forward.
            mock_generate.assert_called_once_with(1000, 0.1)
            mock_split.assert_called_once_with(mock_data, 0.7, 0.15, 0.15)
            mock_train.assert_called_once()
            mock_evaluate.assert_called_once()

            # Check that app state was updated
            assert self.app.is_model_trained
            assert hasattr(self.app, "train_data")
            assert hasattr(self.app, "val_data")
            assert hasattr(self.app, "test_data")

            # Check output strings contain expected information; the metric
            # values are formatted with four decimal places.
            assert "Data Generated Successfully!" in data_info
            assert "Training Metrics:" in training_metrics
            assert "Test Set Evaluation:" in evaluation_metrics
            assert "2.5000" in training_metrics  # MSE value
            assert "0.8500" in training_metrics  # R² value

    def test_generate_and_train_error(self):
        """Test error handling in data generation and training."""
        # Mock the data generator to raise an exception
        with patch.object(
            self.app.data_generator,
            "generate_data",
            side_effect=Exception("Test error"),
        ):
            data_info, training_metrics, evaluation_metrics = (
                self.app.generate_and_train(
                    n_samples=1000,
                    noise_level=0.1,
                    train_size=0.7,
                    val_size=0.15,
                    test_size=0.15,
                )
            )

            # On failure the error goes into the first output; the metric
            # outputs are left empty.
            assert "Error during data generation and training" in data_info
            assert training_metrics == ""
            assert evaluation_metrics == ""

    def test_predict_consumption_not_trained(self):
        """Test prediction when model is not trained."""
        result = self.app.predict_consumption(25.0, "Monday", False)

        assert "Model must be trained first" in result

    def test_predict_consumption_success(self):
        """Test successful prediction."""
        # Set up the app as if it's trained
        self.app.is_model_trained = True

        # Mock the model prediction
        with patch.object(
            self.app.model, "predict", return_value=16.5
        ) as mock_predict, patch.object(
            self.app.model, "get_model_coefficients"
        ) as mock_coeffs:

            mock_coeffs.return_value = {
                "feature_names": ["temperature", "major_event"],
                "coefficients": [0.3, 2.0],
                "intercept": 10.0,
            }

            result = self.app.predict_consumption(25.0, "Monday", True)

            # Check that prediction was called; the boolean event flag must
            # arrive at the model as the integer 1.
            mock_predict.assert_called_once_with(25.0, "Monday", 1)

            # Check output contains expected information
            assert "Estimated Daily Electricity Consumption: 16.5 kWh" in result
            assert "Temperature: 25.0°C" in result
            assert "Day of Week: Monday" in result
            assert "Major Event: Yes" in result
            assert "Model Type: Linear Regression" in result

    def test_predict_consumption_error(self):
        """Test error handling in prediction."""
        # Set up the app as if it's trained
        self.app.is_model_trained = True

        # Mock the model to raise an exception
        with patch.object(
            self.app.model, "predict", side_effect=Exception("Prediction error")
        ):
            result = self.app.predict_consumption(25.0, "Monday", False)

            assert "Error during prediction" in result

    def test_get_model_info_not_trained(self):
        """Test getting model info when model is not trained."""
        result = self.app.get_model_info()

        assert "Model must be trained first" in result

    def test_get_model_info_success(self):
        """Test successful model info retrieval."""
        # Set up the app as if it's trained
        self.app.is_model_trained = True

        # Mock the model coefficients
        with patch.object(self.app.model, "get_model_coefficients") as mock_coeffs:
            mock_coeffs.return_value = {
                "feature_names": ["temperature", "day_tuesday", "major_event"],
                "coefficients": [0.3, 0.5, 2.0],
                "intercept": 10.0,
            }

            result = self.app.get_model_info()

            # Check output contains expected information
            assert "**Model Information:**" in result
            assert "**Model Type:** Linear Regression" in result
            assert "**Intercept:** 10.0000" in result
            assert "**Feature Coefficients:**" in result
            assert "temperature" in result
            assert "major_event" in result
            assert "**Interpretation:**" in result

    def test_get_model_info_error(self):
        """Test error handling in model info retrieval."""
        # Set up the app as if it's trained
        self.app.is_model_trained = True

        # Mock the model to raise an exception
        with patch.object(
            self.app.model,
            "get_model_coefficients",
            side_effect=Exception("Info error"),
        ):
            result = self.app.get_model_info()

            assert "Error getting model info" in result

    def test_boolean_conversion_in_prediction(self):
        """Test that boolean values are correctly converted to integers."""
        # Set up the app as if it's trained
        self.app.is_model_trained = True

        # Mock the model prediction
        with patch.object(self.app.model, "predict") as mock_predict, patch.object(
            self.app.model, "get_model_coefficients"
        ) as mock_coeffs:

            mock_predict.return_value = 15.0
            mock_coeffs.return_value = {
                "feature_names": ["temperature", "major_event"],
                "coefficients": [0.3, 2.0],
                "intercept": 10.0,
            }

            # Test with True
            self.app.predict_consumption(25.0, "Monday", True)
            mock_predict.assert_called_with(25.0, "Monday", 1)

            # Test with False
            self.app.predict_consumption(25.0, "Monday", False)
            mock_predict.assert_called_with(25.0, "Monday", 0)

    def test_data_storage_after_training(self):
        """Test that data is properly stored after training."""
        # Mock the data generator
        with patch.object(
            self.app.data_generator, "generate_data"
        ) as mock_generate, patch.object(
            self.app.data_generator, "split_data"
        ) as mock_split, patch.object(
            self.app.model, "train"
        ) as mock_train, patch.object(
            self.app.model, "evaluate"
        ) as mock_evaluate:

            # Create mock data
            mock_data = pd.DataFrame(
                {
                    "temperature": [25.0, 30.0],
                    "day_of_week": ["Monday", "Tuesday"],
                    "major_event": [0, 1],
                    "consumption_kwh": [15.0, 18.0],
                }
            )
            mock_generate.return_value = mock_data

            train_data = mock_data.iloc[:1]
            val_data = mock_data.iloc[1:2]
            test_data = mock_data.iloc[1:2]
            mock_split.return_value = (train_data, val_data, test_data)

            mock_train.return_value = {
                "train_mse": 2.5,
                "train_rmse": 1.58,
                "train_mae": 1.2,
                "train_r2": 0.85,
            }
            mock_evaluate.return_value = {
                "test_mse": 2.8,
                "test_rmse": 1.67,
                "test_mae": 1.3,
                "test_r2": 0.82,
            }

            # Call the method
            self.app.generate_and_train(1000, 0.1, 0.7, 0.15, 0.15)

            # Check that data is stored; with the 2-row fixture each split
            # slice holds exactly one row.
            assert hasattr(self.app, "train_data")
            assert hasattr(self.app, "val_data")
            assert hasattr(self.app, "test_data")
            assert len(self.app.train_data) == 1
            assert len(self.app.val_data) == 1
            assert len(self.app.test_data) == 1

    def test_interface_creation(self):
        """Test that the Gradio interface can be created."""
        # This test verifies that the interface creation doesn't raise exceptions
        try:
            interface = self.app.create_interface()
            assert interface is not None
        except Exception as e:
            pytest.fail(f"Interface creation failed: {e}")

    def test_prediction_output_format(self):
        """Test that prediction output is properly formatted."""
        # Set up the app as if it's trained
        self.app.is_model_trained = True

        # Mock the model
        with patch.object(
            self.app.model, "predict", return_value=16.5
        ) as mock_predict, patch.object(
            self.app.model, "get_model_coefficients"
        ) as mock_coeffs:

            mock_coeffs.return_value = {
                "feature_names": ["temperature", "major_event"],
                "coefficients": [0.3, 2.0],
                "intercept": 10.0,
            }

            result = self.app.predict_consumption(25.0, "Monday", False)

            # Check Markdown section headers and field formatting
            assert "**Prediction Result:**" in result
            assert "**Input Parameters:**" in result
            assert "**Model Information:**" in result
            assert "Estimated Daily Electricity Consumption: 16.5 kWh" in result
            assert "Temperature: 25.0°C" in result
            assert "Day of Week: Monday" in result
            assert "Major Event: No" in result

    def test_model_info_output_format(self):
        """Test that model info output is properly formatted."""
        # Set up the app as if it's trained
        self.app.is_model_trained = True

        # Mock the model coefficients
        with patch.object(self.app.model, "get_model_coefficients") as mock_coeffs:
            mock_coeffs.return_value = {
                "feature_names": ["temperature", "day_tuesday", "major_event"],
                "coefficients": [0.3, 0.5, 2.0],
                "intercept": 10.0,
            }

            result = self.app.get_model_info()

            # Check formatting, including the Markdown coefficient table header
            assert "**Model Information:**" in result
            assert "**Model Type:**" in result
            assert "**Intercept:**" in result
            assert "**Feature Coefficients:**" in result
            assert "| Feature | Coefficient |" in result
            assert "**Interpretation:**" in result
            assert "Positive coefficients increase predicted consumption" in result
tests/test_data_generator.py ADDED
@@ -0,0 +1,278 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Tests for the DataGenerator module.
3
+
4
+ This module contains comprehensive tests for the DataGenerator class to ensure
5
+ proper data generation, splitting, and file operations.
6
+ """
7
+
8
+ import pytest
9
+ import pandas as pd
10
+ import numpy as np
11
+ import tempfile
12
+ import os
13
+ from src.data_generator import DataGenerator
14
+
15
+
16
+ class TestDataGenerator:
17
+ """Test cases for DataGenerator class."""
18
+
19
+ def test_initialization(self):
20
+ """Test DataGenerator initialization with and without seed."""
21
+ # Test with seed
22
+ generator = DataGenerator(seed=42)
23
+ assert generator.seed == 42
24
+
25
+ # Test without seed
26
+ generator_no_seed = DataGenerator(seed=None)
27
+ assert generator_no_seed.seed is None
28
+
29
+ def test_generate_data_basic(self):
30
+ """Test basic data generation with default parameters."""
31
+ generator = DataGenerator(seed=42)
32
+ data = generator.generate_data()
33
+
34
+ # Check DataFrame structure
35
+ assert isinstance(data, pd.DataFrame)
36
+ assert len(data) == 1000 # Default n_samples
37
+ assert list(data.columns) == [
38
+ "temperature",
39
+ "day_of_week",
40
+ "major_event",
41
+ "consumption_kwh",
42
+ ]
43
+
44
+ # Check data types
45
+ assert data["temperature"].dtype in [np.float64, np.float32]
46
+ assert data["day_of_week"].dtype == "object"
47
+ assert data["major_event"].dtype in [np.int64, np.int32]
48
+ assert data["consumption_kwh"].dtype in [np.float64, np.float32]
49
+
50
+ def test_generate_data_custom_parameters(self):
51
+ """Test data generation with custom parameters."""
52
+ generator = DataGenerator(seed=42)
53
+ data = generator.generate_data(n_samples=500, noise_level=0.2)
54
+
55
+ assert len(data) == 500
56
+
57
+ # Check temperature range
58
+ assert data["temperature"].min() >= 15
59
+ assert data["temperature"].max() <= 35
60
+
61
+ # Check day of week values
62
+ valid_days = [
63
+ "Monday",
64
+ "Tuesday",
65
+ "Wednesday",
66
+ "Thursday",
67
+ "Friday",
68
+ "Saturday",
69
+ "Sunday",
70
+ ]
71
+ assert all(day in valid_days for day in data["day_of_week"].unique())
72
+
73
+ # Check major event values
74
+ assert all(event in [0, 1] for event in data["major_event"].unique())
75
+
76
+ # Check consumption is positive
77
+ assert all(data["consumption_kwh"] > 0)
78
+
79
+ def test_generate_data_reproducibility(self):
80
+ """Test that data generation is reproducible with the same seed."""
81
+ # Reset numpy random seed to ensure reproducibility
82
+ np.random.seed(42)
83
+
84
+ generator1 = DataGenerator(seed=42)
85
+ data1 = generator1.generate_data(n_samples=100)
86
+
87
+ # Reset numpy random seed again
88
+ np.random.seed(42)
89
+
90
+ generator2 = DataGenerator(seed=42)
91
+ data2 = generator2.generate_data(n_samples=100)
92
+
93
+ pd.testing.assert_frame_equal(data1, data2)
94
+
95
+ def test_generate_data_different_seeds(self):
96
+ """Test that different seeds produce different data."""
97
+ generator1 = DataGenerator(seed=42)
98
+ generator2 = DataGenerator(seed=123)
99
+
100
+ data1 = generator1.generate_data(n_samples=100)
101
+ data2 = generator2.generate_data(n_samples=100)
102
+
103
+ # Data should be different
104
+ assert not data1.equals(data2)
105
+
106
+ def test_split_data_basic(self):
107
+ """Test basic data splitting functionality."""
108
+ generator = DataGenerator(seed=42)
109
+ data = generator.generate_data(n_samples=1000)
110
+
111
+ train_data, val_data, test_data = generator.split_data(data)
112
+
113
+ # Check split proportions
114
+ assert len(train_data) == 700 # 70% of 1000
115
+ assert len(val_data) == 150 # 15% of 1000
116
+ assert len(test_data) == 150 # 15% of 1000
117
+
118
+ # Check total samples
119
+ assert len(train_data) + len(val_data) + len(test_data) == len(data)
120
+
121
+ # Check all data is used
122
+ all_data = pd.concat([train_data, val_data, test_data])
123
+ assert len(all_data) == len(data)
124
+
125
+ def test_split_data_custom_proportions(self):
126
+ """Test data splitting with custom proportions."""
127
+ generator = DataGenerator(seed=42)
128
+ data = generator.generate_data(n_samples=1000)
129
+
130
+ train_data, val_data, test_data = generator.split_data(
131
+ data, train_size=0.6, val_size=0.2, test_size=0.2
132
+ )
133
+
134
+ assert len(train_data) == 600
135
+ assert len(val_data) == 200
136
+ assert len(test_data) == 200
137
+
138
+ def test_split_data_validation(self):
139
+ """Test that split proportions validation works."""
140
+ generator = DataGenerator(seed=42)
141
+ data = generator.generate_data(n_samples=100)
142
+
143
+ # Test invalid proportions
144
+ with pytest.raises(AssertionError):
145
+ generator.split_data(data, train_size=0.5, val_size=0.3, test_size=0.3)
146
+
147
+ with pytest.raises(AssertionError):
148
+ generator.split_data(data, train_size=0.4, val_size=0.3, test_size=0.2)
149
+
150
+ def test_split_data_reproducibility(self):
151
+ """Test that data splitting is reproducible."""
152
+ generator = DataGenerator(seed=42)
153
+ data = generator.generate_data(n_samples=1000)
154
+
155
+ # First split
156
+ train1, val1, test1 = generator.split_data(data)
157
+
158
+ # Second split with same data
159
+ train2, val2, test2 = generator.split_data(data)
160
+
161
+ # Results should be identical
162
+ pd.testing.assert_frame_equal(train1, train2)
163
+ pd.testing.assert_frame_equal(val1, val2)
164
+ pd.testing.assert_frame_equal(test1, test2)
165
+
166
+ def test_save_and_load_data(self):
167
+ """Test saving and loading data to/from CSV."""
168
+ generator = DataGenerator(seed=42)
169
+ data = generator.generate_data(n_samples=100)
170
+
171
+ with tempfile.NamedTemporaryFile(
172
+ mode="w", suffix=".csv", delete=False
173
+ ) as tmp_file:
174
+ filepath = tmp_file.name
175
+
176
+ try:
177
+ # Save data
178
+ generator.save_data(data, filepath)
179
+
180
+ # Check file exists
181
+ assert os.path.exists(filepath)
182
+
183
+ # Load data
184
+ loaded_data = generator.load_data(filepath)
185
+
186
+ # Check data is identical
187
+ pd.testing.assert_frame_equal(data, loaded_data)
188
+
189
+ finally:
190
+ # Clean up
191
+ if os.path.exists(filepath):
192
+ os.unlink(filepath)
193
+
194
+ def test_data_statistics(self):
195
+ """Test that generated data has reasonable statistics."""
196
+ generator = DataGenerator(seed=42)
197
+ data = generator.generate_data(n_samples=1000)
198
+
199
+ # Temperature statistics
200
+ assert 15 <= data["temperature"].mean() <= 35
201
+ assert data["temperature"].std() > 0
202
+
203
+ # Consumption statistics
204
+ assert data["consumption_kwh"].mean() > 0
205
+ assert data["consumption_kwh"].std() > 0
206
+
207
+ # Day of week distribution
208
+ day_counts = data["day_of_week"].value_counts()
209
+ assert len(day_counts) == 7
210
+ # All days should have some data
211
+ assert all(count > 0 for count in day_counts.values)
212
+
213
+ # Major event distribution (should be mostly 0s)
214
+ event_counts = data["major_event"].value_counts()
215
+ assert 0 in event_counts.index
216
+ assert 1 in event_counts.index
217
+ # Should be more 0s than 1s
218
+ assert event_counts[0] > event_counts[1]
219
+
220
+ def test_noise_level_effect(self):
221
+ """Test that noise level affects data variability."""
222
+ generator = DataGenerator(seed=42)
223
+
224
+ # Generate data with low noise
225
+ data_low_noise = generator.generate_data(n_samples=1000, noise_level=0.01)
226
+
227
+ # Generate data with high noise
228
+ data_high_noise = generator.generate_data(n_samples=1000, noise_level=0.5)
229
+
230
+ # High noise should have higher standard deviation
231
+ assert (
232
+ data_high_noise["consumption_kwh"].std()
233
+ > data_low_noise["consumption_kwh"].std()
234
+ )
235
+
236
+ def test_temperature_consumption_correlation(self):
237
+ """Test that temperature and consumption have positive correlation."""
238
+ generator = DataGenerator(seed=42)
239
+ data = generator.generate_data(n_samples=1000)
240
+
241
+ correlation = data["temperature"].corr(data["consumption_kwh"])
242
+ assert correlation > 0 # Should be positive correlation
243
+
244
+ def test_day_of_week_effect(self):
245
+ """Test that different days have different consumption patterns."""
246
+ generator = DataGenerator(seed=42)
247
+ data = generator.generate_data(n_samples=1000)
248
+
249
+ # Group by day and check consumption means
250
+ day_consumption = data.groupby("day_of_week")["consumption_kwh"].mean()
251
+
252
+ # Should have some variation between days
253
+ assert day_consumption.std() > 0
254
+
255
+ # Weekend days (Saturday, Sunday) should generally have higher consumption
256
+ weekend_avg = (day_consumption["Saturday"] + day_consumption["Sunday"]) / 2
257
+ weekday_avg = (
258
+ day_consumption["Monday"]
259
+ + day_consumption["Tuesday"]
260
+ + day_consumption["Wednesday"]
261
+ + day_consumption["Thursday"]
262
+ + day_consumption["Friday"]
263
+ ) / 5
264
+
265
+ # This might not always be true due to randomness, but should be generally true
266
+ # We'll just check that there's variation
267
+ assert abs(weekend_avg - weekday_avg) > 0.1
268
+
269
+ def test_major_event_effect(self):
270
+ """Test that major events increase consumption."""
271
+ generator = DataGenerator(seed=42)
272
+ data = generator.generate_data(n_samples=1000)
273
+
274
+ # Group by major event and check consumption means
275
+ event_consumption = data.groupby("major_event")["consumption_kwh"].mean()
276
+
277
+ # Consumption should be higher when there's a major event
278
+ assert event_consumption[1] > event_consumption[0]
tests/test_integration.py ADDED
@@ -0,0 +1,308 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Integration tests for the Daily Household Electricity Consumption Predictor.
3
+
4
+ This module contains integration tests that test the complete workflow
5
+ from data generation through model training to prediction.
6
+ """
7
+
8
+ import pytest
9
+ import pandas as pd
10
+ import numpy as np
11
+ import tempfile
12
+ import os
13
+ from src.data_generator import DataGenerator
14
+ from src.model import ElectricityConsumptionModel
15
+ from src.app import ElectricityPredictorApp
16
+
17
+
18
class TestIntegration:
    """Integration tests for the complete system.

    Exercises the full pipeline: DataGenerator -> ElectricityConsumptionModel
    -> ElectricityPredictorApp, including persistence round-trips.
    """

    def setup_method(self):
        """Set up a fresh generator, model, and app for each test method."""
        self.generator = DataGenerator(seed=42)
        self.model = ElectricityConsumptionModel()
        self.app = ElectricityPredictorApp()

    def test_complete_workflow(self):
        """Test the complete workflow from data generation to prediction."""
        # Step 1: Generate data
        data = self.generator.generate_data(n_samples=1000, noise_level=0.1)
        assert len(data) == 1000
        assert all(
            col in data.columns
            for col in ["temperature", "day_of_week", "major_event", "consumption_kwh"]
        )

        # Step 2: Split data
        train_data, val_data, test_data = self.generator.split_data(data)
        assert len(train_data) + len(val_data) + len(test_data) == len(data)

        # Step 3: Train model
        X_train = train_data.drop("consumption_kwh", axis=1)
        # Double brackets keep a DataFrame (not a Series) as the target.
        y_train = train_data[["consumption_kwh"]]
        train_metrics = self.model.train(X_train, y_train)

        assert self.model.is_trained
        assert "train_r2" in train_metrics
        assert train_metrics["train_r2"] > 0.3  # Reasonable performance

        # Step 4: Evaluate model
        X_test = test_data.drop("consumption_kwh", axis=1)
        y_test = test_data[["consumption_kwh"]]
        test_metrics = self.model.evaluate(X_test, y_test)

        assert "test_r2" in test_metrics
        assert test_metrics["test_r2"] > 0.3  # Reasonable performance

        # Step 5: Make predictions
        prediction1 = self.model.predict(25.0, "Monday", 0)
        prediction2 = self.model.predict(30.0, "Saturday", 1)

        assert prediction1 > 0
        assert prediction2 > 0
        assert (
            prediction2 > prediction1
        )  # Higher temp + weekend + event should increase consumption

    def test_app_integration(self):
        """Test the complete app workflow."""
        # Test data generation and training through the app
        data_info, training_metrics, evaluation_metrics = self.app.generate_and_train(
            n_samples=500,
            noise_level=0.1,
            train_size=0.7,
            val_size=0.15,
            test_size=0.15,
        )

        assert self.app.is_model_trained
        # The app reports status as formatted text, so assertions check substrings.
        assert "Data Generated Successfully!" in data_info
        assert "Training Metrics:" in training_metrics
        assert "Test Set Evaluation:" in evaluation_metrics

        # Test prediction through the app
        prediction_result = self.app.predict_consumption(25.0, "Monday", False)
        assert "Estimated Daily Electricity Consumption:" in prediction_result
        assert "Temperature: 25.0°C" in prediction_result

        # Test model info through the app
        model_info = self.app.get_model_info()
        assert "Model Information:" in model_info
        assert "Feature Coefficients:" in model_info

    def test_model_persistence(self):
        """Test model saving and loading."""
        # Generate data and train model
        data = self.generator.generate_data(n_samples=500)
        train_data, _, _ = self.generator.split_data(data)

        X_train = train_data.drop("consumption_kwh", axis=1)
        y_train = train_data[["consumption_kwh"]]
        self.model.train(X_train, y_train)

        # Save model; delete=False so the path survives the context manager.
        with tempfile.NamedTemporaryFile(suffix=".joblib", delete=False) as tmp_file:
            model_path = tmp_file.name

        try:
            self.model.save_model(model_path)
            assert os.path.exists(model_path)

            # Load model in new instance
            new_model = ElectricityConsumptionModel()
            new_model.load_model(model_path)

            assert new_model.is_trained

            # Test predictions are identical
            pred1 = self.model.predict(25.0, "Monday", 0)
            pred2 = new_model.predict(25.0, "Monday", 0)

            assert abs(pred1 - pred2) < 1e-10

        finally:
            if os.path.exists(model_path):
                os.unlink(model_path)

    def test_data_persistence(self):
        """Test data saving and loading."""
        # Generate data
        data = self.generator.generate_data(n_samples=100)

        # Save data
        with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp_file:
            data_path = tmp_file.name

        try:
            self.generator.save_data(data, data_path)
            assert os.path.exists(data_path)

            # Load data
            loaded_data = self.generator.load_data(data_path)

            # Check data is identical
            pd.testing.assert_frame_equal(data, loaded_data)

        finally:
            if os.path.exists(data_path):
                os.unlink(data_path)

    def test_model_performance_consistency(self):
        """Test that model performance is consistent across runs."""
        # Generate data
        data = self.generator.generate_data(n_samples=1000, noise_level=0.1)
        train_data, _, test_data = self.generator.split_data(data)

        # Train model multiple times with same data
        X_train = train_data.drop("consumption_kwh", axis=1)
        y_train = train_data[["consumption_kwh"]]
        X_test = test_data.drop("consumption_kwh", axis=1)
        y_test = test_data[["consumption_kwh"]]

        r2_scores = []
        for _ in range(3):
            model = ElectricityConsumptionModel()
            model.train(X_train, y_train)
            metrics = model.evaluate(X_test, y_test)
            r2_scores.append(metrics["test_r2"])

        # R² scores should be very similar (within 0.01)
        assert max(r2_scores) - min(r2_scores) < 0.01

    def test_feature_importance_consistency(self):
        """Test that feature importance is consistent with domain knowledge."""
        # Generate data and train model
        data = self.generator.generate_data(n_samples=1000)
        train_data, _, _ = self.generator.split_data(data)

        X_train = train_data.drop("consumption_kwh", axis=1)
        y_train = train_data[["consumption_kwh"]]
        self.model.train(X_train, y_train)

        # Get coefficients
        coefficients = self.model.get_model_coefficients()

        # Find temperature coefficient
        temp_idx = coefficients["feature_names"].index("temperature")
        temp_coef = coefficients["coefficients"][temp_idx]

        # Find major event coefficient
        event_idx = coefficients["feature_names"].index("major_event")
        event_coef = coefficients["coefficients"][event_idx]

        # Temperature should have positive effect (higher temp = higher consumption)
        assert temp_coef > 0

        # Major event should have positive effect (events increase consumption)
        assert event_coef > 0

    def test_prediction_bounds(self):
        """Test that predictions are within reasonable bounds."""
        # Generate data and train model
        data = self.generator.generate_data(n_samples=1000)
        train_data, _, _ = self.generator.split_data(data)

        X_train = train_data.drop("consumption_kwh", axis=1)
        y_train = train_data[["consumption_kwh"]]
        self.model.train(X_train, y_train)

        # Test predictions across different inputs: 5 temps x 7 days x 2 events.
        predictions = []

        for temp in [15, 20, 25, 30, 35]:
            for day in [
                "Monday",
                "Tuesday",
                "Wednesday",
                "Thursday",
                "Friday",
                "Saturday",
                "Sunday",
            ]:
                for event in [0, 1]:
                    pred = self.model.predict(temp, day, event)
                    predictions.append(pred)

        # All predictions should be positive
        assert all(p > 0 for p in predictions)

        # Predictions should be within reasonable range (5-50 kWh)
        assert all(5 <= p <= 50 for p in predictions)

    def test_data_quality_checks(self):
        """Test that generated data meets quality requirements."""
        # Generate data
        data = self.generator.generate_data(n_samples=1000)

        # Check for missing values
        assert not data.isnull().any().any()

        # Check data types
        assert data["temperature"].dtype in [np.float64, np.float32]
        assert data["day_of_week"].dtype == "object"
        assert data["major_event"].dtype in [np.int64, np.int32]
        assert data["consumption_kwh"].dtype in [np.float64, np.float32]

        # Check value ranges
        assert data["temperature"].min() >= 15
        assert data["temperature"].max() <= 35
        assert all(data["major_event"].isin([0, 1]))
        assert all(data["consumption_kwh"] > 0)

        # Check day of week values
        valid_days = [
            "Monday",
            "Tuesday",
            "Wednesday",
            "Thursday",
            "Friday",
            "Saturday",
            "Sunday",
        ]
        assert all(day in valid_days for day in data["day_of_week"].unique())

        # Check correlations make sense
        temp_consumption_corr = data["temperature"].corr(data["consumption_kwh"])
        assert temp_consumption_corr > 0  # Positive correlation

    def test_error_handling(self):
        """Test error handling in the complete workflow."""
        # Test with invalid temperature
        with pytest.raises(ValueError):
            self.model.predict(10.0, "Monday", 0)  # Temperature too low

        with pytest.raises(ValueError):
            self.model.predict(40.0, "Monday", 0)  # Temperature too high

        # Test with invalid day
        with pytest.raises(ValueError):
            self.model.predict(25.0, "InvalidDay", 0)

        # Test with invalid major event
        with pytest.raises(ValueError):
            self.model.predict(25.0, "Monday", 2)  # Invalid value

        # Test prediction without training
        untrained_model = ElectricityConsumptionModel()
        with pytest.raises(ValueError):
            untrained_model.predict(25.0, "Monday", 0)

    def test_app_state_management(self):
        """Test that app state is properly managed."""
        # Initially not trained
        assert not self.app.is_model_trained

        # After training
        self.app.generate_and_train(500, 0.1, 0.7, 0.15, 0.15)
        assert self.app.is_model_trained

        # Check that data is stored
        assert hasattr(self.app, "train_data")
        assert hasattr(self.app, "val_data")
        assert hasattr(self.app, "test_data")

        # Check data sizes
        assert len(self.app.train_data) > 0
        assert len(self.app.val_data) > 0
        assert len(self.app.test_data) > 0
tests/test_model.py ADDED
@@ -0,0 +1,359 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Tests for the Model module.
3
+
4
+ This module contains comprehensive tests for the ElectricityConsumptionModel class
5
+ to ensure proper model training, evaluation, prediction, and persistence.
6
+ """
7
+
8
+ import pytest
9
+ import pandas as pd
10
+ import numpy as np
11
+ import tempfile
12
+ import os
13
+ from src.model import ElectricityConsumptionModel
14
+ from src.data_generator import DataGenerator
15
+
16
+
17
class TestElectricityConsumptionModel:
    """Test cases for ElectricityConsumptionModel class.

    Covers input validation, training, evaluation, prediction,
    coefficient inspection, and save/load persistence.
    """

    def setup_method(self):
        """Set up shared generated/split data and a fresh model per test."""
        self.generator = DataGenerator(seed=42)
        self.data = self.generator.generate_data(n_samples=1000)
        self.train_data, self.val_data, self.test_data = self.generator.split_data(
            self.data
        )

        self.model = ElectricityConsumptionModel()

    def test_initialization(self):
        """Test model initialization."""
        model = ElectricityConsumptionModel()

        # A fresh model has no fitted state.
        assert model.model is None
        assert model.preprocessor is None
        assert model.feature_names is None
        assert not model.is_trained

    def test_prepare_features_valid_data(self):
        """Test feature preparation with valid data."""
        # Test with valid data
        valid_data = pd.DataFrame(
            {
                "temperature": [25.0, 30.0],
                "day_of_week": ["Monday", "Saturday"],
                "major_event": [0, 1],
            }
        )

        prepared_data = self.model.prepare_features(valid_data)

        assert isinstance(prepared_data, pd.DataFrame)
        # Column order matters for the downstream model pipeline.
        assert list(prepared_data.columns) == [
            "temperature",
            "day_of_week",
            "major_event",
        ]
        assert len(prepared_data) == 2

    def test_prepare_features_missing_columns(self):
        """Test feature preparation with missing columns."""
        invalid_data = pd.DataFrame(
            {
                "temperature": [25.0],
                "day_of_week": ["Monday"],
                # Missing major_event column
            }
        )

        with pytest.raises(ValueError, match="Missing required columns"):
            self.model.prepare_features(invalid_data)

    def test_prepare_features_invalid_temperature(self):
        """Test feature preparation with invalid temperature values."""
        invalid_data = pd.DataFrame(
            {
                "temperature": [10.0, 40.0],  # Outside valid range
                "day_of_week": ["Monday", "Tuesday"],
                "major_event": [0, 0],
            }
        )

        with pytest.raises(ValueError, match="Temperature must be between 15 and 35"):
            self.model.prepare_features(invalid_data)

    def test_prepare_features_invalid_day_of_week(self):
        """Test feature preparation with invalid day of week values."""
        invalid_data = pd.DataFrame(
            {"temperature": [25.0], "day_of_week": ["InvalidDay"], "major_event": [0]}
        )

        with pytest.raises(ValueError, match="Day of week must be one of"):
            self.model.prepare_features(invalid_data)

    def test_prepare_features_invalid_major_event(self):
        """Test feature preparation with invalid major event values."""
        invalid_data = pd.DataFrame(
            {
                "temperature": [25.0],
                "day_of_week": ["Monday"],
                "major_event": [2],  # Invalid value
            }
        )

        with pytest.raises(ValueError, match="Major event must be 0 or 1"):
            self.model.prepare_features(invalid_data)

    def test_train_model(self):
        """Test model training."""
        X_train = self.train_data.drop("consumption_kwh", axis=1)
        # Double brackets keep the target as a DataFrame, not a Series.
        y_train = self.train_data[["consumption_kwh"]]

        metrics = self.model.train(X_train, y_train)

        # Check that model is trained
        assert self.model.is_trained
        assert self.model.model is not None
        assert self.model.feature_names is not None

        # Check metrics structure
        expected_metrics = ["train_mse", "train_rmse", "train_mae", "train_r2"]
        assert all(metric in metrics for metric in expected_metrics)

        # Check metric values are reasonable
        assert metrics["train_mse"] > 0
        assert metrics["train_rmse"] > 0
        assert metrics["train_mae"] > 0
        assert 0 <= metrics["train_r2"] <= 1

    def test_evaluate_model_not_trained(self):
        """Test evaluation when model is not trained."""
        X_test = self.test_data.drop("consumption_kwh", axis=1)
        y_test = self.test_data[["consumption_kwh"]]

        with pytest.raises(ValueError, match="Model must be trained before evaluation"):
            self.model.evaluate(X_test, y_test)

    def test_evaluate_model(self):
        """Test model evaluation."""
        # Train model first
        X_train = self.train_data.drop("consumption_kwh", axis=1)
        y_train = self.train_data[["consumption_kwh"]]
        self.model.train(X_train, y_train)

        # Evaluate model
        X_test = self.test_data.drop("consumption_kwh", axis=1)
        y_test = self.test_data[["consumption_kwh"]]

        metrics = self.model.evaluate(X_test, y_test)

        # Check metrics structure
        expected_metrics = ["test_mse", "test_rmse", "test_mae", "test_r2"]
        assert all(metric in metrics for metric in expected_metrics)

        # Check metric values are reasonable
        assert metrics["test_mse"] > 0
        assert metrics["test_rmse"] > 0
        assert metrics["test_mae"] > 0
        assert 0 <= metrics["test_r2"] <= 1

    def test_predict_not_trained(self):
        """Test prediction when model is not trained."""
        with pytest.raises(
            ValueError, match="Model must be trained before making predictions"
        ):
            self.model.predict(25.0, "Monday", 0)

    def test_predict_valid_inputs(self):
        """Test prediction with valid inputs."""
        # Train model first
        X_train = self.train_data.drop("consumption_kwh", axis=1)
        y_train = self.train_data[["consumption_kwh"]]
        self.model.train(X_train, y_train)

        # Test prediction
        prediction = self.model.predict(25.0, "Monday", 0)

        assert isinstance(prediction, float)
        assert prediction >= 0  # Should be non-negative

    def test_predict_different_inputs(self):
        """Test prediction with different input combinations."""
        # Train model first
        X_train = self.train_data.drop("consumption_kwh", axis=1)
        y_train = self.train_data[["consumption_kwh"]]
        self.model.train(X_train, y_train)

        # Test different temperature values
        pred1 = self.model.predict(20.0, "Monday", 0)
        pred2 = self.model.predict(30.0, "Monday", 0)

        # Higher temperature should generally lead to higher consumption
        assert pred2 > pred1

        # Test different days
        pred3 = self.model.predict(25.0, "Saturday", 0)
        pred4 = self.model.predict(25.0, "Monday", 0)

        # Should be different (though not necessarily higher/lower due to randomness)
        assert pred3 != pred4

        # Test with and without major event
        pred5 = self.model.predict(25.0, "Monday", 1)
        pred6 = self.model.predict(25.0, "Monday", 0)

        # Major event should increase consumption
        assert pred5 > pred6

    def test_get_model_coefficients_not_trained(self):
        """Test getting coefficients when model is not trained."""
        with pytest.raises(
            ValueError, match="Model must be trained before accessing coefficients"
        ):
            self.model.get_model_coefficients()

    def test_get_model_coefficients(self):
        """Test getting model coefficients."""
        # Train model first
        X_train = self.train_data.drop("consumption_kwh", axis=1)
        y_train = self.train_data[["consumption_kwh"]]
        self.model.train(X_train, y_train)

        coefficients = self.model.get_model_coefficients()

        # Check structure
        assert "feature_names" in coefficients
        assert "coefficients" in coefficients
        assert "intercept" in coefficients

        # Check types
        assert isinstance(coefficients["feature_names"], list)
        assert isinstance(coefficients["coefficients"], list)
        assert isinstance(coefficients["intercept"], float)

        # Check lengths: one coefficient per feature name.
        assert len(coefficients["feature_names"]) == len(coefficients["coefficients"])
        assert len(coefficients["feature_names"]) > 0

    def test_save_model_not_trained(self):
        """Test saving model when not trained."""
        with tempfile.NamedTemporaryFile(suffix=".joblib", delete=False) as tmp_file:
            filepath = tmp_file.name

        try:
            with pytest.raises(ValueError, match="Model must be trained before saving"):
                self.model.save_model(filepath)
        finally:
            if os.path.exists(filepath):
                os.unlink(filepath)

    def test_save_and_load_model(self):
        """Test saving and loading model."""
        # Train model first
        X_train = self.train_data.drop("consumption_kwh", axis=1)
        y_train = self.train_data[["consumption_kwh"]]
        self.model.train(X_train, y_train)

        # delete=False so the path remains valid after the context manager closes.
        with tempfile.NamedTemporaryFile(suffix=".joblib", delete=False) as tmp_file:
            filepath = tmp_file.name

        try:
            # Save model
            self.model.save_model(filepath)
            assert os.path.exists(filepath)

            # Create new model and load
            new_model = ElectricityConsumptionModel()
            new_model.load_model(filepath)

            # Check that model is trained
            assert new_model.is_trained
            assert new_model.model is not None

            # Test prediction with loaded model
            original_pred = self.model.predict(25.0, "Monday", 0)
            loaded_pred = new_model.predict(25.0, "Monday", 0)

            # Predictions should be identical
            assert abs(original_pred - loaded_pred) < 1e-10

        finally:
            if os.path.exists(filepath):
                os.unlink(filepath)

    def test_load_model_file_not_found(self):
        """Test loading model from non-existent file."""
        with pytest.raises(FileNotFoundError):
            self.model.load_model("non_existent_file.joblib")

    def test_model_performance_reasonable(self):
        """Test that model performance is reasonable."""
        # Train model
        X_train = self.train_data.drop("consumption_kwh", axis=1)
        y_train = self.train_data[["consumption_kwh"]]
        train_metrics = self.model.train(X_train, y_train)

        # Evaluate model
        X_test = self.test_data.drop("consumption_kwh", axis=1)
        y_test = self.test_data[["consumption_kwh"]]
        test_metrics = self.model.evaluate(X_test, y_test)

        # R-squared should be reasonable (not too low, not perfect)
        assert 0.3 <= train_metrics["train_r2"] <= 0.995
        assert 0.3 <= test_metrics["test_r2"] <= 0.995

        # Test R-squared should not be much worse than train R-squared
        assert test_metrics["test_r2"] >= train_metrics["train_r2"] - 0.2

    def test_model_consistency(self):
        """Test that model predictions are consistent."""
        # Train model
        X_train = self.train_data.drop("consumption_kwh", axis=1)
        y_train = self.train_data[["consumption_kwh"]]
        self.model.train(X_train, y_train)

        # Make same prediction multiple times
        pred1 = self.model.predict(25.0, "Monday", 0)
        pred2 = self.model.predict(25.0, "Monday", 0)
        pred3 = self.model.predict(25.0, "Monday", 0)

        # All predictions should be identical
        assert abs(pred1 - pred2) < 1e-10
        assert abs(pred2 - pred3) < 1e-10

    def test_model_feature_importance(self):
        """Test that model captures feature importance correctly."""
        # Train model
        X_train = self.train_data.drop("consumption_kwh", axis=1)
        y_train = self.train_data[["consumption_kwh"]]
        self.model.train(X_train, y_train)

        coefficients = self.model.get_model_coefficients()

        # Temperature coefficient should be positive (higher temp = higher consumption)
        temp_idx = coefficients["feature_names"].index("temperature")
        assert coefficients["coefficients"][temp_idx] > 0

        # Major event coefficient should be positive (events increase consumption)
        event_idx = coefficients["feature_names"].index("major_event")
        assert coefficients["coefficients"][event_idx] > 0

    def test_model_with_extreme_values(self):
        """Test model behavior with extreme input values."""
        # Train model
        X_train = self.train_data.drop("consumption_kwh", axis=1)
        y_train = self.train_data[["consumption_kwh"]]
        self.model.train(X_train, y_train)

        # Test with minimum temperature
        min_pred = self.model.predict(15.0, "Monday", 0)
        assert min_pred >= 0

        # Test with maximum temperature
        max_pred = self.model.predict(35.0, "Monday", 0)
        assert max_pred >= 0

        # Test with major event
        event_pred = self.model.predict(25.0, "Monday", 1)
        assert event_pred >= 0