Addressing Small Datasets

#12

by alialexsalman - opened May 19

May 19

Hello, I wonder if you could release a new checkpoint suited to small datasets. My time series dataset covers 5 years, so it has only 60 monthly data points, and I am looking for a way to use this model on it. Thank you!

gabaid971

May 19

I've tried it on a small dataset, if you want to check:
https://github.com/gabaid971/ts-forecaster

alialexsalman

May 19

According to the documentation, context_len (the length of the context window, i.e. the number of past time steps the model uses to make a forecast) can't be less than 32, and horizon_len (the length of the forecast horizon, i.e. the number of future time steps the model predicts) can't be less than 128.

gabaid971

May 19

Oh okay, I didn't notice you only have 60 data points. The context_len must be a multiple of 32. Is it realistic to create intermediate values in your dataset? For example, for each month you create 30 points that all carry the month's value, and for prediction purposes a month's value is the average of its 30 points.

alialexsalman

May 19

edited May 19

Do you mean creating 30 random values whose mean equals the month's value, and doing that for every month? Not sure!

gabaid971

May 19

Yes, you understood. But is it realistic to augment your dataset artificially this way? For example, if you have 1 point for January 2023, you create 30 consecutive points with the same value (the one for January 2023). Then in the prediction phase, you just average the forecast in blocks of 30 points to get the values of the following months.
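A minimal sketch of that idea with NumPy, in case it helps. The function names `upsample_repeat` and `downsample_mean` and the sample values are hypothetical, and nothing here calls the model itself: it only shows the repeat-then-average bookkeeping around whatever forecaster you use.

```python
import numpy as np

def upsample_repeat(monthly, factor=30):
    """Repeat each monthly value `factor` times (30 identical points per month)."""
    return np.repeat(np.asarray(monthly, dtype=float), factor)

def downsample_mean(points, factor=30):
    """Average consecutive blocks of `factor` points; a trailing partial block is dropped."""
    points = np.asarray(points, dtype=float)
    usable = (len(points) // factor) * factor
    return points[:usable].reshape(-1, factor).mean(axis=1)

# Hypothetical monthly values, for illustration only.
monthly = [10.0, 12.0, 11.0]
expanded = upsample_repeat(monthly)    # 90 points fed to the forecaster as context
recovered = downsample_mean(expanded)  # back to one value per month
```

With a 128-step forecast and 30 points per month, `downsample_mean` would give 4 full months and drop the last 8 steps. Note that repeating values adds no new information, so this only changes the length of the input, not what the model can learn from it.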

alialexsalman

May 19

Lol, that's a cool legit cheat if it works!

gabaid971

May 19

Never tried but who knows ahah

Sintayew4

May 19

https://huggingface.co/google/timesfm-1.0-200m/discussions/12#664a723390135abe9b7af30d

siriuz42

Google org May 30

Sorry for the late reply: you can make forecasts on time series with lengths < 32. The context_len used during model definition is more of an internal parameter and has to be a multiple of 32 so that we can compile the model inference, but after that any input context length should be fine. That said, the model's performance will likely disappoint if the provided context is short. Try it first.
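As a small illustration of the multiple-of-32 constraint, here is one way to pick a context_len value that covers a given series length. The helper `round_up_to_multiple` is hypothetical and not part of the library, and choosing the smallest valid multiple is an assumption; keeping a larger value is equally consistent with the explanation above.

```python
def round_up_to_multiple(n, base=32):
    """Smallest multiple of `base` that is >= n (hypothetical helper)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return ((n + base - 1) // base) * base

series_len = 60                                 # 5 years of monthly points
context_len = round_up_to_multiple(series_len)  # 64, a multiple of 32
```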

alialexsalman

May 31

Thank you @siriuz42 for getting back to me. I will try it again and let you know how it works with a small dataset. Cheers!
