LightGBM Time Series Forecasting Pipeline

This repository contains a complete, reusable forecasting pipeline built on LightGBM models.

The pipeline includes:

9 trained LightGBM models
feature engineering pipeline
blending configuration
training statistics
reusable inference workflow

The models were serialized into a single .pkl file for easy deployment and reuse.

Repository Contents

File	Description
`full_pipeline.pkl`	Serialized pipeline containing all trained models
`inference_example.py`	Example script for inference
`README.md`	Documentation
`requirements.txt`	Required dependencies

Included Pipeline Objects

The pickle file contains:

pipeline.keys()

# horizon_models
# subcat_models
# train_stats
# blend_scores
# params
# blend_power

Installation

pip install lightgbm joblib pandas numpy huggingface_hub

Download Model

from huggingface_hub import hf_hub_download
import joblib

REPO_ID = "andrewmos/lightbm-ts-forecasting-kaggle"

model_path = hf_hub_download(
    repo_id=REPO_ID,
    filename="full_pipeline.pkl",
    repo_type="model"
)

pipeline = joblib.load(model_path)

print("Pipeline loaded successfully")

Load Models

horizon_models = pipeline["horizon_models"]
subcat_models = pipeline["subcat_models"]

train_stats = pipeline["train_stats"]

blend_scores = pipeline["blend_scores"]

params = pipeline["params"]

blend_power = pipeline["blend_power"]
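
This README does not document how blend_scores and blend_power are combined. As one possible reading, the sketch below assumes that blend_scores maps a model name to a validation error (lower is better) and that blend_power sharpens inverse-error weights. The formula and the helper names blend_weights and blend are hypothetical, so check them against the training code before relying on them.

```python
def blend_weights(scores, power):
    """Turn per-model validation errors into normalized blend weights.

    ASSUMPTION: lower score = better model, weight proportional to (1 / score) ** power.
    """
    raw = {name: (1.0 / score) ** power for name, score in scores.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def blend(predictions, weights):
    """Weighted sum of per-model predictions (floats or NumPy arrays)."""
    return sum(weights[name] * pred for name, pred in predictions.items())


# Made-up numbers purely for illustration; not taken from the pipeline
example_scores = {"horizon": 0.5, "subcat": 1.0}
example_weights = blend_weights(example_scores, power=2)
example_blend = blend({"horizon": 10.0, "subcat": 20.0}, example_weights)
```

Under this assumption, a larger blend_power shifts more weight toward the model with the lowest error, while a power of 0 gives every model equal weight.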

Important Note About Feature Engineering

The .pkl file stores the trained models, but it does NOT store the preprocessing code.

Before running inference, you must recreate the same feature engineering steps used during training, so that the input data contains every column the models expect.

Example:

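import pandas as pd

def create_features(df):
    # Work on a copy so the caller's DataFrame is not modified
    df = df.copy()

    # Parse the date column once, then derive calendar features
    dates = pd.to_datetime(df["date"])
    df["month"] = dates.dt.month
    df["year"] = dates.dt.year

    return df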

Example Inference

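# test_df: DataFrame with the same engineered features used during training
model_info = horizon_models[1]  # entry stored under key 1

model = model_info["model"]  # trained model
features = model_info["features"]  # feature columns the model expects

X_test = test_df[features]

predictions = model.predict(X_test)

print(predictions)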
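
Selecting test_df[features] raises a KeyError if any expected column is missing, which is the usual symptom of feature engineering that does not match training. A small check before predicting can make that failure easier to read. The helper missing_features and the column names below are hypothetical and serve only as a sketch.

```python
def missing_features(columns, features):
    """Return the expected feature names absent from `columns`, in model order."""
    available = set(columns)
    return [name for name in features if name not in available]


# Illustrative column names only; with the real pipeline you would pass
# test_df.columns and model_info["features"]
example_missing = missing_features(
    columns=["date", "month", "year"],
    features=["month", "year", "lag_7"],
)
if example_missing:
    print(f"Missing feature columns: {example_missing}")
```

An empty list means every column the model expects is present, so the selection step will not fail on a missing name.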

Reproducibility

Recommended package versions:

lightgbm>=4.0
numpy>=1.24
pandas>=2.0
joblib>=1.3

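The .pkl file is a Python pickle, so loading it with library versions that differ substantially from those used at training time can fail. Installing compatible versions helps avoid these serialization issues.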
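
To compare an environment against these minimums before loading the pickle, a check along the following lines may help. It uses only the standard library. The names RECOMMENDED, parse_version and check_versions are hypothetical, and the version parser is deliberately simplified (it ignores pre-release ordering).

```python
from importlib.metadata import PackageNotFoundError, version

# Minimums copied from the recommended versions listed above
RECOMMENDED = {"lightgbm": "4.0", "numpy": "1.24", "pandas": "2.0", "joblib": "1.3"}


def parse_version(text):
    """Keep the leading numeric components, padded to three: '2.1.0rc1' -> (2, 1, 0)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that "4.0" and "4.0.0" compare as equal
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def check_versions(minimums):
    """Return {package: problem} for packages that are missing or too old."""
    problems = {}
    for package, minimum in minimums.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            problems[package] = "not installed"
            continue
        if parse_version(installed) < parse_version(minimum):
            problems[package] = f"{installed} is older than {minimum}"
    return problems


for package, problem in check_versions(RECOMMENDED).items():
    print(f"{package}: {problem}")
```

If nothing is printed, every listed package is installed at or above its recommended minimum.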

Use Cases

This repository can be used for:

time series forecasting
reusable inference pipelines
Kaggle competitions
LightGBM deployment examples
tabular ML workflows

Author

Andrés Mosquera
