# Introduction
This notebook tests different ways to deploy the feature extraction model to the Hugging Face Hub. The goal is to find the deployment method that makes the model usable through the Inference API and easily accessible to users. Ideally, it would also be possible to load the model directly with the Hugging Face libraries. The following methods are tested:

1. [Using ``timm`` to extract features](#Using-timm-to-extract-features) -> &#x274C;
2. [Using ``transformers`` to extract features](#Using-transformers-to-extract-features) -> &#x274C;
    1. [Feature extraction task](#Feature-extraction-task)
    2. [``AutoModel``](#AutoModel)
    3. [Batched feature extraction](#Batched-feature-extraction)
3. [Using simple download](#Using-simple-download) -> &#x2705;
4. [Using custom model](#Using-custom-model) -> &#x1F6A7;

**Helpful links and resources**
- https://huggingface.co/docs/transformers/custom_models - Alternative: creating custom models
- https://huggingface.co/templates/feature-extraction - Template for inference API
- https://huggingface-widgets.netlify.app/ - Widgets for visualizing models in inference API
- https://huggingface.co/docs/hub/models-widgets#how-can-i-control-my-models-widget-inference-api-parameters - Controlling inference API parameters

# Imports

In [12]:
import timm
import torch
from transformers import pipeline, AutoTokenizer, AutoModel

from src.deprecated.pipeline_wrapper import MyPipeline

# 1. Using ``timm`` to extract features

In [10]:
test_tensor = torch.randn(2, 3, 1, 1)

In [31]:
feature_extractor = timm.create_model('resnet18', pretrained=True, num_classes=0, global_pool='')
features = feature_extractor.forward_features(test_tensor)

In [32]:
features

tensor([[[[0.0000]],

         [[0.6944]],

         [[0.0000]],

         ...,

         [[0.0000]],

         [[0.0000]],

         [[0.0000]]],


        [[[0.0000]],

         [[0.0000]],

         [[0.0143]],

         ...,

         [[0.0000]],

         [[0.0000]],

         [[0.0000]]]], grad_fn=<ReluBackward0>)

In [15]:
feature_extractor = timm.create_model('HUBII-Platform/ECG2HRV', pretrained=True, num_classes=0, global_pool='')

RuntimeError: Unknown model (ECG2HRV)

# 2. Using ``transformers`` to extract features

1. Feature extraction task

In [33]:
# Example with pipeline
checkpoint = "facebook/bart-base"
feature_extractor = pipeline("feature-extraction", framework="pt", model=checkpoint)
text = "Transformers is an awesome library!"

In [7]:
feature_extractor = pipeline("feature-extraction", model='HUBII-Platform/ECG2HRV')

OSError: It looks like the config file at 'C:\Users\merti\.cache\huggingface\hub\models--HUBII-Platform--ECG2HRV\snapshots\75f67e01de12e33cfb05cfbfed35ff621246b3f9\config.json' is not a valid JSON file.
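
The `OSError` above reports that the repository's `config.json` could not be parsed as JSON. The following is a minimal sketch, using only the standard library, of how a config file could be checked locally before it is uploaded. The helper name `check_config_json` and the `"model_type": "ecg2hrv"` entry are hypothetical and used for illustration only; they are not part of `transformers` or of the actual repository.

```python
import json
import tempfile
from pathlib import Path


def check_config_json(path):
    """Return (True, parsed_dict) if `path` holds a JSON object, else (False, reason)."""
    try:
        text = Path(path).read_text(encoding="utf-8")
    except OSError as exc:
        return False, f"cannot read file: {exc}"
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    if not isinstance(parsed, dict):
        return False, "top-level JSON value is not an object"
    return True, parsed


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical broken config: single quotes are not valid JSON
    bad = Path(tmp) / "config.json"
    bad.write_text("{'model_type': 'ecg2hrv'}", encoding="utf-8")
    ok_bad, info_bad = check_config_json(bad)

    # Same (hypothetical) content written with json.dumps is valid
    good = Path(tmp) / "config_fixed.json"
    good.write_text(json.dumps({"model_type": "ecg2hrv"}), encoding="utf-8")
    ok_good, info_good = check_config_json(good)

print(ok_bad, info_bad)
print(ok_good, info_good)
```

A check like this only confirms that the file is syntactically valid JSON. Whether `transformers` can map the config to a model class is a separate question.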

In [None]:
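from transformers.pipelines import PIPELINE_REGISTRY

# Register the custom pipeline class under a new task name
PIPELINE_REGISTRY.register_pipeline(
    "ecg2hrv",
    pipeline_class=MyPipeline,
    # model_class=MyModel
)
feature_extractor = pipeline("ecg2hrv")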

In [None]:
feature_extractor = pipeline("ecg2hrv", model="HUBII-Platform/ECG2HRV")

2. ``AutoModel``

In [2]:
# Example with AutoModel
model = AutoModel.from_pretrained('HUBII-Platform/ECG2HRV')

OSError: It looks like the config file at 'C:\Users\merti\.cache\huggingface\hub\models--HUBII-Platform--ECG2HRV\snapshots\75f67e01de12e33cfb05cfbfed35ff621246b3f9\config.json' is not a valid JSON file.

3. Batched feature extraction: not supported (see https://huggingface.co/docs/transformers/main_classes/feature_extractor#transformers.BatchFeature).
This is not possible because `BatchFeature` is not a model itself but a component used within the pipeline.

# 3. Using simple download
(See https://huggingface.co/julien-c/wine-quality?structured_data=%7B%7D)

**Instantiate the model and save it as a joblib file in the Hugging Face repository**

In [1]:
import joblib
import numpy as np

from src.ecg2hrv import ECG2HRV

In [2]:
# Instantiate model
model = ECG2HRV()
# Save
# Forward slashes avoid the invalid "\E" escape sequence and also work on Windows
joblib.dump(model, "../ECG2HRV.joblib")
# Load in notebook
model = joblib.load("../ECG2HRV.joblib")
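
`joblib` builds on Python's `pickle`, which stores an object as a reference to its class (module path and class name) plus the instance state, not the class's source code. Loading the `.joblib` file therefore presumably requires that `src.ecg2hrv` is importable in the loading environment. The sketch below illustrates this with the standard-library `pickle` module and `collections.OrderedDict` as a stand-in class; it does not use the `ECG2HRV` class itself.

```python
import pickle
from collections import OrderedDict

# Serialize an empty instance of a stand-in class
payload = pickle.dumps(OrderedDict())

# The payload names the module and the class; it does not embed their code
has_module = b"collections" in payload
has_class = b"OrderedDict" in payload
print(len(payload), has_module, has_class)

# Unpickling re-imports the class by name and rebuilds the instance
restored = pickle.loads(payload)
print(type(restored).__name__)
```

If the same holds for the saved model, users who download the file also need the package that defines `ECG2HRV` installed under the same import path.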

**Test the model locally with a synthetic ECG signal**

In [3]:
duration_seconds = 10 # Time duration for ECG signal (in seconds)
sample_rate = 100 # Sample rate (samples per second)
num_samples = duration_seconds * sample_rate # Number of samples

t = np.linspace(0, duration_seconds, num_samples, endpoint=False) # Time array (endpoint excluded so the spacing is exactly 1 / sample_rate)

# Generate ECG signal (example synthetic data)
ecg_signal = (
    0.2 * np.sin(2 * np.pi * 1 * t) +
    0.5 * np.sin(2 * np.pi * 0.5 * t) -
    0.1 * np.sin(2 * np.pi * 2.5 * t)
)

# Add some random noise
ecg_signal += np.random.normal(scale=0.1, size=num_samples)

In [4]:
model(input_data=ecg_signal, frequency=100.0)

[{'HRV_MeanNN': 413.4782608695652,
  'HRV_SDNN': 100.97743652790477,
  'HRV_SDANN1': nan,
  'HRV_SDNNI1': nan,
  'HRV_SDANN2': nan,
  'HRV_SDNNI2': nan,
  'HRV_SDANN5': nan,
  'HRV_SDNNI5': nan,
  'HRV_RMSSD': 92.78518690551262,
  'HRV_SDSD': 94.96410805236795,
  'HRV_CVNN': 0.24421462041449105,
  'HRV_CVSD': 0.22440160870944167,
  'HRV_MedianNN': 400.0,
  'HRV_MadNN': 118.60799999999999,
  'HRV_MCVNN': 0.29651999999999995,
  'HRV_IQRNN': 150.0,
  'HRV_SDRMSSD': 1.0882926455785953,
  'HRV_Prc20NN': 320.0,
  'HRV_Prc80NN': 490.0,
  'HRV_pNN50': 52.17391304347826,
  'HRV_pNN20': 69.56521739130434,
  'HRV_MinNN': 310.0,
  'HRV_MaxNN': 640.0,
  'HRV_HTI': 5.75,
  'HRV_TINN': 0.0}]
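
The key names in this output resemble standard time-domain HRV metrics. As a rough sketch, assuming the conventional definitions (mean and sample standard deviation of the NN intervals, root mean square of successive differences), some of them can be computed as shown below. The function `time_domain_hrv` and the example intervals are hypothetical; this is not the implementation used inside `ECG2HRV`. The second part checks that the ratios reported above are consistent with these definitions.

```python
import math
from statistics import mean, stdev


def time_domain_hrv(nn_ms):
    """Compute a few time-domain HRV metrics from NN intervals in milliseconds."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]  # successive differences
    mean_nn = mean(nn_ms)
    sdnn = stdev(nn_ms)  # assumption: sample standard deviation (n - 1)
    rmssd = math.sqrt(mean([d * d for d in diffs]))
    return {
        "MeanNN": mean_nn,
        "SDNN": sdnn,
        "RMSSD": rmssd,
        "CVNN": sdnn / mean_nn,
        "CVSD": rmssd / mean_nn,
        "SDRMSSD": sdnn / rmssd,
    }


# Hypothetical NN intervals, for illustration only
example = time_domain_hrv([800, 810, 790, 805, 795])
print(example)

# Values copied from the model output above
reported = {
    "HRV_MeanNN": 413.4782608695652,
    "HRV_SDNN": 100.97743652790477,
    "HRV_RMSSD": 92.78518690551262,
    "HRV_CVNN": 0.24421462041449105,
    "HRV_CVSD": 0.22440160870944167,
    "HRV_SDRMSSD": 1.0882926455785953,
}
cvnn_check = reported["HRV_SDNN"] / reported["HRV_MeanNN"]
cvsd_check = reported["HRV_RMSSD"] / reported["HRV_MeanNN"]
sdrmssd_check = reported["HRV_SDNN"] / reported["HRV_RMSSD"]
print(cvnn_check, cvsd_check, sdrmssd_check)
```

The `nan` entries (`HRV_SDANN*`, `HRV_SDNNI*`) are plausibly due to the 10-second recording being too short for metrics defined over multi-minute segments, though that depends on the underlying implementation.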

**Test the model loaded from the Hub with the same synthetic ECG signal**

In [7]:
from huggingface_hub import hf_hub_download
import joblib

# Load from hub
REPO_ID = "hubii-world/ECG2HRV"
FILENAME = "ECG2HRV.joblib"

model = joblib.load(
    hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
)

In [8]:
# Run model
model(input_data=ecg_signal, frequency=100.0)

[{'HRV_MeanNN': 413.4782608695652,
  'HRV_SDNN': 100.97743652790477,
  'HRV_SDANN1': nan,
  'HRV_SDNNI1': nan,
  'HRV_SDANN2': nan,
  'HRV_SDNNI2': nan,
  'HRV_SDANN5': nan,
  'HRV_SDNNI5': nan,
  'HRV_RMSSD': 92.78518690551262,
  'HRV_SDSD': 94.96410805236795,
  'HRV_CVNN': 0.24421462041449105,
  'HRV_CVSD': 0.22440160870944167,
  'HRV_MedianNN': 400.0,
  'HRV_MadNN': 118.60799999999999,
  'HRV_MCVNN': 0.29651999999999995,
  'HRV_IQRNN': 150.0,
  'HRV_SDRMSSD': 1.0882926455785953,
  'HRV_Prc20NN': 320.0,
  'HRV_Prc80NN': 490.0,
  'HRV_pNN50': 52.17391304347826,
  'HRV_pNN20': 69.56521739130434,
  'HRV_MinNN': 310.0,
  'HRV_MaxNN': 640.0,
  'HRV_HTI': 5.75,
  'HRV_TINN': 0.0}]

# 4. Using custom model (not tested yet)
