# <span style="font-weight:bold; font-size: 3rem; color:#2656a3;">**MSc BDS Module - Data Engineering and Machine Learning Operations in Business (MLOps)** </span> <span style="font-weight:bold; font-size: 3rem; color:#333;">- Part 02: Feature Pipeline</span>

## <span style='color:#2656a3'> 🗒️ The notebook is divided into the following sections:</span>
1. Parsing new data.
2. Inserting the new data into the Feature Store.

## <span style='color:#2656a3'> ⚙️ Import of libraries and packages</span>

We start by moving to the project folder that holds the functions we created for electricity prices and weather measures (including live API calls and data preprocessing). We then import the libraries needed for this notebook and suppress warnings to keep the output clean.

In [1]:
# Move up one directory level to reach the folder that holds our functions
%cd ..

# Import the functions from the features folder
# These are the functions we created to generate features for electricity prices and weather measures
from features import electricity_prices, weather_measures

# We go back into the notebooks folder
%cd notebooks

/Users/camillahannesbo/Documents/AAU/Master - BDS/2. semester/Data Engineering and Machine learning operations in Business/MLOPs-Assignment-
/Users/camillahannesbo/Documents/AAU/Master - BDS/2. semester/Data Engineering and Machine learning operations in Business/MLOPs-Assignment-/notebooks
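As an alternative to changing the working directory with `%cd`, the project root can be added to Python's import path. The following is a minimal sketch, assuming the notebook runs from the `notebooks` folder so that the project root is its parent; `project_root` is a name introduced here for illustration only.

```python
import sys
from pathlib import Path

# Assumption: the current working directory is the `notebooks` folder,
# so the project root (which contains `features`) is its parent
project_root = Path.cwd().parent

# Put the project root on the import path without changing the working directory
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

# With the path in place, the import from the cell above would work unchanged:
# from features import electricity_prices, weather_measures
```

This keeps the working directory stable, so relative paths used later in the notebook are not affected by the import step.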


In [2]:
# Importing pandas for data handling
import pandas as pd

# Ignore warnings
import warnings 
warnings.filterwarnings('ignore')

## <span style='color:#2656a3'> 🪄 Parsing New Data</span>
To fetch the most recent (non-historical) electricity prices, we set `historical` to `False`.

To provide up-to-date weather measures, we fetch a weather forecast for the next 5 days.

The calendar data does not change, so no new calendar data is retrieved.

### <span style="color:#2656a3;">💸 Electricity Prices per day from Energinet</span>

In [3]:
# Fetching non-historical electricity prices for area DK1
electricity_df = electricity_prices.electricity_prices(
    historical=False,
    area=["DK1"]
)

In [4]:
# Display the electricity dataframe
electricity_df

Unnamed: 0,timestamp,datetime,date,hour,dk1_spotpricedkk_kwh
0,1714953600000,2024-05-06 00:00:00,2024-05-06,0,0.61803
1,1714957200000,2024-05-06 01:00:00,2024-05-06,1,0.59364
2,1714960800000,2024-05-06 02:00:00,2024-05-06,2,0.59975
3,1714964400000,2024-05-06 03:00:00,2024-05-06,3,0.59632
4,1714968000000,2024-05-06 04:00:00,2024-05-06,4,0.6093
5,1714971600000,2024-05-06 05:00:00,2024-05-06,5,0.65271
6,1714975200000,2024-05-06 06:00:00,2024-05-06,6,0.79875
7,1714978800000,2024-05-06 07:00:00,2024-05-06,7,0.97157
8,1714982400000,2024-05-06 08:00:00,2024-05-06,8,0.7493
9,1714986000000,2024-05-06 09:00:00,2024-05-06,9,0.66383
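The `timestamp` column in the output above appears to hold the `datetime` column expressed as milliseconds since the Unix epoch. The sketch below checks that reading on the first three rows, copied from the output; it is a quick consistency check, not part of the `electricity_prices` function.

```python
import pandas as pd

# First three rows copied from the electricity output above
sample = pd.DataFrame(
    {
        "timestamp": [1714953600000, 1714957200000, 1714960800000],
        "datetime": pd.to_datetime(
            ["2024-05-06 00:00:00", "2024-05-06 01:00:00", "2024-05-06 02:00:00"]
        ),
        "hour": [0, 1, 2],
    }
)

# Interpret `timestamp` as milliseconds since the Unix epoch
decoded = pd.to_datetime(sample["timestamp"], unit="ms")

# Compare the decoded values with the `datetime` and `hour` columns
timestamps_match = bool((decoded == sample["datetime"]).all())
hours_match = bool((decoded.dt.hour == sample["hour"]).all())
print(timestamps_match, hours_match)
```

The same comparison can be run on the full `electricity_df` to confirm that the three time columns agree before the data is uploaded.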


### <span style="color:#2656a3;"> 🌈 Forecast Weather Measures from Open Meteo</span>

In [5]:
# Fetching weather forecast measures for the next 5 days
weather_forecast_df = weather_measures.forecast_weather_measures(
    forecast_length=5
)

In [6]:
# Display the weather forecast dataframe
weather_forecast_df

Unnamed: 0,timestamp,datetime,date,hour,temperature_2m,relative_humidity_2m,precipitation,rain,snowfall,weather_code,cloud_cover,wind_speed_10m,wind_gusts_10m
0,1714953600000,2024-05-06 00:00:00,2024-05-06,0,9.6,93.0,0.2,0.2,0.0,51.0,100.0,14.4,24.8
1,1714957200000,2024-05-06 01:00:00,2024-05-06,1,9.7,93.0,0.0,0.0,0.0,3.0,100.0,14.0,24.8
2,1714960800000,2024-05-06 02:00:00,2024-05-06,2,9.5,91.0,0.0,0.0,0.0,3.0,100.0,14.0,24.8
3,1714964400000,2024-05-06 03:00:00,2024-05-06,3,9.5,91.0,0.0,0.0,0.0,3.0,100.0,13.0,23.4
4,1714968000000,2024-05-06 04:00:00,2024-05-06,4,9.6,92.0,0.0,0.0,0.0,3.0,100.0,14.0,24.1
...,...,...,...,...,...,...,...,...,...,...,...,...,...
115,1715367600000,2024-05-10 19:00:00,2024-05-10,19,11.5,68.0,0.0,0.0,0.0,3.0,89.0,5.2,13.0
116,1715371200000,2024-05-10 20:00:00,2024-05-10,20,10.5,71.0,0.0,0.0,0.0,3.0,88.0,3.4,8.6
117,1715374800000,2024-05-10 21:00:00,2024-05-10,21,9.5,74.0,0.0,0.0,0.0,3.0,87.0,2.5,4.3
118,1715378400000,2024-05-10 22:00:00,2024-05-10,22,8.6,78.0,0.0,0.0,0.0,3.0,91.0,2.6,4.3
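A 5-day hourly forecast should contain 5 × 24 = 120 rows spaced exactly one hour apart. The sketch below shows one way to check this before uploading. `check_hourly_frame` is a hypothetical helper introduced here, and the demo frame is synthetic rather than real forecast data.

```python
import pandas as pd

MS_PER_HOUR = 3_600_000


def check_hourly_frame(df, expected_days):
    """Return a list of problems found; an empty list means the frame looks consistent."""
    problems = []
    expected_rows = expected_days * 24
    if len(df) != expected_rows:
        problems.append(f"expected {expected_rows} rows, got {len(df)}")
    if df["timestamp"].duplicated().any():
        problems.append("duplicate timestamps")
    # Consecutive rows should be exactly one hour apart
    steps = df["timestamp"].sort_values().diff().dropna()
    if not (steps == MS_PER_HOUR).all():
        problems.append("gaps or irregular spacing between timestamps")
    return problems


# Synthetic stand-in for weather_forecast_df: 120 hourly timestamps
start_ms = 1714953600000  # 2024-05-06 00:00:00 as epoch milliseconds
demo = pd.DataFrame({"timestamp": [start_ms + i * MS_PER_HOUR for i in range(5 * 24)]})

full_result = check_hourly_frame(demo, expected_days=5)
short_result = check_hourly_frame(demo.iloc[:-1], expected_days=5)
print(full_result)
print(short_result)
```

In the notebook, the call would be `check_hourly_frame(weather_forecast_df, expected_days=5)`, assuming the frame has the `timestamp` column shown in the output above.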


## <span style="color:#2656a3;"> 📡 Connecting to Hopsworks Feature Store</span>

We connect to the Hopsworks Feature Store so we can access the Feature Groups and upload the new data to them.

In [7]:
# Importing the hopsworks module for interacting with the Hopsworks platform
import hopsworks

# Logging into the Hopsworks project
project = hopsworks.login()

# Getting the feature store from the project
fs = project.get_feature_store()

Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/550040
Connected. Call `.close()` to terminate connection gracefully.


In [8]:
# Retrieve the feature groups
electricity_fg = fs.get_feature_group(
    name="electricity_prices",
    version=1,
)

weather_fg = fs.get_feature_group(
    name="weather_measurements",
    version=1,
)

### <span style="color:#2656a3;"> ⬆️ Uploading new data to the Feature Store</span>
Here we upload the new data to the retrieved Feature Groups using the `insert` method.
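Before inserting, it can be useful to make sure a batch does not contain the same key twice. The sketch below uses plain pandas. The key column `timestamp` is an assumption, since this notebook does not show the Feature Groups' primary key, and `drop_duplicate_keys` is a hypothetical helper. The first price in the demo batch is a made-up value used only to create a duplicate.

```python
import pandas as pd


def drop_duplicate_keys(df, key="timestamp"):
    """Keep the last row for each key value and return the frame sorted by key."""
    # Assumption: `key` identifies a row uniquely in the Feature Group
    deduplicated = df.drop_duplicates(subset=[key], keep="last")
    return deduplicated.sort_values(key).reset_index(drop=True)


# Synthetic batch in which one timestamp appears twice
batch = pd.DataFrame(
    {
        "timestamp": [1714957200000, 1714953600000, 1714957200000],
        "dk1_spotpricedkk_kwh": [0.59000, 0.61803, 0.59364],
    }
)

clean_batch = drop_duplicate_keys(batch)
print(clean_batch)
```

The cleaned frame could then be passed to `insert` in place of the original dataframe.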

In [9]:
# Inserting the electricity_df into the feature group named electricity_fg
electricity_fg.insert(electricity_df, 
                      write_options={"wait_for_job" : False})

Uploading Dataframe: 100.00% |██████████| Rows 24/24 | Elapsed Time: 00:06 | Remaining Time: 00:00


Launching job: electricity_prices_1_offline_fg_materialization
Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/550040/jobs/named/electricity_prices_1_offline_fg_materialization/executions


(<hsfs.core.job.Job at 0x13aea7b90>, None)

In [10]:
# Inserting the weather_forecast_df into the feature group named weather_fg
weather_fg.insert(weather_forecast_df, 
                  write_options={"wait_for_job" : False})

Uploading Dataframe: 100.00% |██████████| Rows 120/120 | Elapsed Time: 00:06 | Remaining Time: 00:00


Launching job: weather_measurements_1_offline_fg_materialization
Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/550040/jobs/named/weather_measurements_1_offline_fg_materialization/executions


(<hsfs.core.job.Job at 0x13aea7d50>, None)

---
## <span style="color:#2656a3;">⏭️ **Next:** Part 03: Training </span>

Next, we will create a feature view and a training dataset. We will then train a model and save it in the model registry.