Question regarding dynamic and static features

#6
by NXTNiklas - opened

The documentation currently describes how to use past_values, future_values, past_time_features, future_time_features, static_categorical_features and static_real_features.

As I understand it, the static features are constant over the time series.

It also mentions the parameter num_dynamic_real_features (the number of dynamic real-valued features).
However, I do not understand how to pass those values to the model.

I'm trying to predict 4 time series 30 steps into the future. For each step in the input time series I have 50 measurements taken at a higher sampling rate, which I'd like to pass to the model to improve the prediction of the 4 time series. The model does not need to predict the high-rate values, though.

I was wondering the same thing. It seems like dynamic_real_features is not currently an input to the model:
https://github.com/huggingface/transformers/blob/v4.26.1/src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Probably have to file an issue for this.

Hugging Face org
edited Feb 23, 2023

Thank you for the question, @NXTNiklas and @clafrieda.

So the idea is that any dynamic_real_features are like the date-time features, in the sense that they need to be known at inference time, so you can concatenate them to your date-time feature tensor and pass them to the model that way.

The reason for having two config values, one for the number of date-time features and one for the number of dynamic real features, is mostly for your data preparation functions, where you can also choose not to concatenate any dynamic features you may have, depending on the configuration value...

Is that somewhat clear? In the blog post you can see "Step 7", where this concatenation is done.
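The concatenation described above can be sketched roughly as follows. This is only an illustration in NumPy, not the blog post's code: the sizes, the random data, and the variable names `time_features` and `dynamic_real` are assumptions. It also ignores the extra past steps the real model needs to build its lag features.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
context_length, prediction_length = 60, 30
num_time_features, num_dynamic_real_features = 2, 3
total_length = context_length + prediction_length

rng = np.random.default_rng(0)

# Date-time features, known for both the past and the future steps.
time_features = rng.normal(size=(total_length, num_time_features))
# Dynamic real covariates; these must also be known for the future steps.
dynamic_real = rng.normal(size=(total_length, num_dynamic_real_features))

# Concatenate along the feature axis so both travel in one tensor.
all_features = np.concatenate([time_features, dynamic_real], axis=-1)

# Split into the past (context) window and the future (prediction) window.
past_time_features = all_features[:context_length]
future_time_features = all_features[context_length:]
```

In this sketch each window ends up with `num_time_features + num_dynamic_real_features` columns, so the feature dimension the model sees has to account for both kinds of features.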

Also, currently all the dynamic covariates (i.e. the dynamic real ones) need to be the same size in the time dimension as the target values. So if your covariates are at a higher sampling rate, you need to resample them to the frequency of the target with an appropriate aggregation function, and if your covariates are at a lower sampling rate, you need to copy them over to the appropriate time stamps of the target.
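As a rough sketch of that alignment step, assuming the covariate is sampled at an exact integer multiple (or fraction) of the target frequency: the helper names `downsample_covariate` and `upsample_covariate` are hypothetical and not part of any library.

```python
import numpy as np

def downsample_covariate(high_rate, factor, agg=np.mean):
    """Aggregate each run of `factor` high-rate samples into one target step."""
    high_rate = np.asarray(high_rate, dtype=float)
    if high_rate.shape[0] % factor != 0:
        raise ValueError("length must be a multiple of factor")
    return agg(high_rate.reshape(-1, factor), axis=1)

def upsample_covariate(low_rate, factor):
    """Copy each low-rate sample over `factor` consecutive target steps."""
    return np.repeat(np.asarray(low_rate, dtype=float), factor)

# Covariate sampled 3x faster than the target: average each group of 3.
down = downsample_covariate(np.arange(6.0), factor=3)

# Covariate sampled 3x slower than the target: repeat each value 3 times.
up = upsample_covariate([10.0, 20.0], factor=3)

# With 50 measurements per target step, several aggregates can be stacked
# into a small set of dynamic real features, one row per target step.
steps, factor = 4, 50
measurements = np.random.default_rng(0).normal(size=steps * factor)
summary = np.stack(
    [downsample_covariate(measurements, factor, agg=f)
     for f in (np.mean, np.std, np.min, np.max)],
    axis=-1,
)
```

Which aggregation function is appropriate depends on the data. Mean, standard deviation, minimum, and maximum are only examples here.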

Thanks for the response Kashif. I see the concatenation. These become part of the positional encoding, correct? It seems like that means the decoder will see the future values of these features.

I wonder if it makes sense in some cases to concatenate these features with the values. For example, let's say I want to predict the price of apples and I know that the price of apples is somehow related to the past price of apples and the past price of tomatoes. I don't know the future price of tomatoes, so I can't add it as a future_time_feature. However, perhaps I can concatenate the tomato prices with the values, and add a time feature that is 0=apples and 1=tomatoes (in addition to the date-time). Then I can set the lags appropriately such that the tomato prices get attention (would have to rework the data loader as well). Would something like that work?

Hugging Face org
edited Feb 23, 2023

@clafrieda that is why there is this caveat about the dynamic real-valued features: they need to be known at prediction time.

Yes, in theory the model can look at future dynamic features in the decoder, and there is nothing wrong with doing that. But during training the decoder has a causal mask, and since these features are part of the input to the decoder, the model is not looking at future dynamic real covariates during training... I would not know how to add a causal mask to only the target part of the input...
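For readers unfamiliar with the term, here is a minimal sketch of what a causal mask does, assuming the standard lower-triangular form. It is not the library's implementation, and `prediction_length = 4` is an arbitrary illustrative value.

```python
import numpy as np

prediction_length = 4  # hypothetical number of decoder positions

# Lower-triangular boolean mask: entry [t, s] is True when decoder
# position t is allowed to attend to decoder position s (that is, s <= t).
causal_mask = np.tril(np.ones((prediction_length, prediction_length), dtype=bool))

# Positions visible from the first and from the last decoder step.
visible_from_first = np.flatnonzero(causal_mask[0])
visible_from_last = np.flatnonzero(causal_mask[-1])
```

Because the covariates are concatenated into the decoder input, a mask like this hides later steps as a whole. It does not distinguish between the target part and the covariate part of a step.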

The other thing to note with these models is that you are learning the future distribution of the target conditioned on the covariates. However, as in your example, if these entities are causally connected to the target, then you need to model that causal relationship differently. In particular, if at inference time you want to know what the price of apples would be had the price of tomatoes been x, then you really need the machinery of causal inference (the "do" operator and so on). A conditional model like this could give the wrong answer.

Feel free to open an issue about the "causal time series model" and I can discuss it there.

Thanks again Kashif. That makes sense.
