Machine Learning

How can I use the feature store for time series out of sample prediction?

NAS
New Contributor III

For instance, could I have a new model trained every Saturday on training data up to the previous Friday, and then use that model to make daily predictions during the following week?

In the same context, if the features are keyed by date, could I create a training set with a different frequency (e.g. monthly)? Or would I have to recreate the features at monthly frequency?

6 REPLIES

Kaniz
Community Manager

Hi @NAS ! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Or else I will get back to you soon. Thanks.

Dan_Z
Honored Contributor

I'm guessing you are using some sort of time-series model here, one that relies on some sort of auto-correlation? Usually for that you need to work with a complete time series. Are you doing one prediction for the entire week (by day)? Or are you doing separate predictions every day? What `feature` do you want to store in the Feature Store to help achieve this?

NAS
New Contributor III

Hi Dan. I'm not using a time series model in this particular case. Just a classification model that takes the latest observation of a number of time series, and a few historical observations for some of them. I do daily predictions, but recalibrate the model every week.

Dan_Z
Honored Contributor

@Nestor Sulikowski I'm honestly having a little trouble grasping what you are trying to achieve. Surely, if you have some function that you use to cull the historical data or other time-series features, you can save those and the resultant data in the Feature Store. I don't see why you would have any issues with out-of-sample prediction if you are not using a time-series model.

NAS
New Contributor III

Thanks. I should write a more detailed question, but I'm lazy 🙂

Thanks for taking the time to share your thoughts.

sean_owen
Honored Contributor II

In this case, you just want your feature table to have a timestamp column declared as a timestamp key. You would compute your features as of whatever dates you like and write them to the table, and those are used to train. At runtime, to make a prediction as of "now", you pass the current time as the value of the timestamp key. Under the hood it's an as-of join, which matches the row with the latest time <= your value.
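
To make the as-of matching concrete, here is a minimal sketch in plain pandas. It only illustrates the "latest time <= your value" semantics with `pandas.merge_asof`; it is not the Feature Store API or its implementation, and the table contents and the names `features`, `requests`, `series_id`, `ts` and `last_value` are made up for the example.

```python
import pandas as pd

# Hypothetical daily feature table, keyed by series_id plus a timestamp key.
# Rows exist for Monday 2024-01-01 through Friday 2024-01-05 only.
features = pd.DataFrame({
    "series_id": ["A"] * 5,
    "ts": pd.to_datetime([
        "2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05",
    ]),
    "last_value": [10.0, 11.0, 12.0, 13.0, 14.0],
})

# Hypothetical lookup rows: the times "as of" which features are wanted.
# 2024-01-03 has an exact match; 2024-01-06 (a Saturday) has no row of its own.
requests = pd.DataFrame({
    "series_id": ["A", "A"],
    "ts": pd.to_datetime(["2024-01-03", "2024-01-06"]),
})

# Backward as-of join: for each request, take the feature row with the
# latest ts <= the request's ts, within the same series_id.
# Both inputs must be sorted by the "on" column.
joined = pd.merge_asof(
    requests.sort_values("ts"),
    features.sort_values("ts"),
    on="ts",
    by="series_id",
    direction="backward",
)

print(joined)
```

The Saturday request picks up Friday's row, which is the behaviour you would want for a model trained on Saturday with data up to the previous Friday. The same kind of lookup works if the left-hand side holds monthly dates instead of daily ones, so a lower-frequency training set can be assembled from daily-keyed features, assuming that "latest daily value as of each monthly date" is the feature you actually want.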
