
How can I use the feature store for time series out of sample prediction?

NAS
New Contributor III

For instance, can I train a new model every Saturday on data up to the previous Friday, and then use that model to make daily predictions over the following week?

In the same context, if the features are keyed by date, could I create a training set with a different frequency (e.g. monthly)? Or would I have to recreate the features at monthly frequency?

6 REPLIES

Dan_Z
Databricks Employee

I'm guessing you are using a time-series model here, one that relies on some form of autocorrelation? Usually for that you need to work with a complete time series. Are you making one prediction for the entire week (by day), or a separate prediction every day? What `feature` do you want to store in the Feature Store to help achieve this?

NAS
New Contributor III

Hi Dan. I'm not using a time series model in this particular case, just a classification model that takes the latest observation of a number of time series, plus a few historical observations for some of them. I make daily predictions, but recalibrate the model every week.

Dan_Z
Databricks Employee

@Nestor Sulikowski I'm honestly having a little trouble grasping what you are trying to achieve. Surely, if you have some function that you use to cull the historical data or other time series features, you can save those and the resulting data in the Feature Store. I don't see why you would have any issues with out-of-sample prediction if you are not using a time-series model.

NAS
New Contributor III

Thanks. I should write a more detailed question, but I'm lazy 🙂

Thanks for taking the time to share your thoughts.

sean_owen
Databricks Employee

In this case, you just want your feature store to have a timestamp column as a timestamp key. You would compute your features as of whatever dates you like and add them as features, and those are used to train. At runtime, to make a prediction as of "now", you pass the current time as the value of the timestamp key. Under the hood it's an as-of join, which matches the latest time <= your value when finding the matching row.
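To make the as-of matching concrete, here is a small sketch using pandas rather than the Feature Store itself. The table, the column names (`entity_id`, `ts`, `feature_a`) and the dates are made up for illustration, and `pandas.merge_asof` only stands in for whatever the Feature Store does internally.

```python
import pandas as pd

# Hypothetical feature table: features computed as of a few chosen dates.
features = pd.DataFrame({
    "entity_id": ["a", "a", "a"],
    "ts": pd.to_datetime(["2024-01-05", "2024-01-12", "2024-01-19"]),
    "feature_a": [1.0, 2.0, 3.0],
})

# Hypothetical lookups: the timestamps we want features "as of".
requests = pd.DataFrame({
    "entity_id": ["a", "a", "a", "a"],
    "ts": pd.to_datetime(["2024-01-01", "2024-01-12", "2024-01-17", "2024-01-25"]),
})

# direction="backward" picks, per entity, the latest feature row with ts <= request ts.
# Both frames must be sorted by the "on" column.
joined = pd.merge_asof(
    requests.sort_values("ts"),
    features.sort_values("ts"),
    on="ts",
    by="entity_id",
    direction="backward",
)
print(joined)
```

A request earlier than the first feature row gets no match (NaN), an exact timestamp match is used as is, and anything in between falls back to the most recent earlier row.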

RasmusBrostroem
New Contributor II

Hello, I just came across this and I have a similar question. I am quite new to Databricks and the feature store and would like to use it, but I am having some difficulty figuring out what specifically I can do.

In my case I am using XGBoost regression to forecast the next 48 hours, but I am using lagged values and rolling averages of the target variable that span from 1 hour to 48 hours back in the time series. Thus, I need to recalculate these lag and rolling features as the model makes its forecast: I treat the forecasted value at t+1 as the true target when creating the forecast for t+2, and so on. Therefore I can't just have the true-target lag and rolling features in the feature store, or can I? Is there something else I would have to do to be able to use the feature store setup in this situation?

Hope it makes sense, I appreciate all inputs and help 🙂
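For readers following along, the recursive scheme described above might look roughly like this in plain Python. Everything here is illustrative: `make_features`, `recursive_forecast` and `toy_model` are hypothetical names, the lags and window are shortened to keep the example small, and the toy model merely stands in for a trained XGBoost regressor. It does not use the Feature Store API.

```python
from typing import Callable, List, Sequence


def make_features(series: Sequence[float], lags: Sequence[int], window: int) -> List[float]:
    """Build lag features plus one rolling mean from the end of the series."""
    lagged = [series[-lag] for lag in lags]
    rolling_mean = sum(series[-window:]) / window
    return lagged + [rolling_mean]


def recursive_forecast(
    history: Sequence[float],
    predict_one: Callable[[List[float]], float],
    horizon: int,
    lags: Sequence[int] = (1, 2),
    window: int = 3,
) -> List[float]:
    """Forecast `horizon` steps, feeding each prediction back in as if it were observed."""
    if len(history) < max(max(lags), window):
        raise ValueError("history is too short for the requested lags/window")
    series = list(history)  # copy so the caller's data is not modified
    forecasts = []
    for _ in range(horizon):
        x = make_features(series, lags, window)  # recomputed at every step
        y_hat = predict_one(x)
        forecasts.append(y_hat)
        series.append(y_hat)  # treat the forecast at t+1 as the target for t+2
    return forecasts


def toy_model(x: List[float]) -> float:
    """Stand-in for a trained regressor: average of lag-1 and the rolling mean."""
    return 0.5 * x[0] + 0.5 * x[-1]


history = [1.0, 2.0, 3.0]
forecasts = recursive_forecast(history, toy_model, horizon=2)
print(forecasts)
```

The point of the sketch is that only the features at the forecast origin come from observed data; from the second step onward they are derived from earlier predictions, so they have to be computed at prediction time.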
