
How can I use the feature store for time series out of sample prediction?

NAS
New Contributor III

For instance, could I train a new model every Saturday on data up to the previous Friday, and then use that model to make daily predictions over the following week?

In the same context, if the features are keyed by date, could I create a training set at a different frequency (e.g. monthly)? Or would I have to recreate the features at monthly frequency?

5 REPLIES

Dan_Z
Databricks Employee

I'm guessing you are using some sort of time-series model here, one that relies on some sort of auto-correlation? Usually for that you need to work with a complete time series. Are you making one prediction for the entire week (by day), or a separate prediction every day? What `feature` do you want to store in the Feature Store to help achieve this?

NAS
New Contributor III

Hi Dan. I'm not using a time series model in this particular case. It's just a classification model that takes the latest observation of a number of time series, plus a few historical observations for some of them. I make daily predictions, but recalibrate the model every week.
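
To make the schedule concrete, here is a minimal sketch in plain Python (not the Feature Store API). It assumes one particular convention that the thread does not spell out: a prediction made on any given day uses the model trained on data up to the most recent Friday strictly before that day. The names `training_cutoff` and `split_for`, and the `(observation_date, record)` row layout, are hypothetical.

```python
from datetime import date, timedelta

FRIDAY = 4  # date.weekday(): Monday is 0, so Friday is 4


def training_cutoff(prediction_day: date) -> date:
    """Most recent Friday strictly before prediction_day (assumed convention)."""
    days_back = (prediction_day.weekday() - FRIDAY) % 7
    if days_back == 0:
        # A prediction made on a Friday still uses last week's model.
        days_back = 7
    return prediction_day - timedelta(days=days_back)


def split_for(prediction_day: date, rows):
    """Return the cutoff and the rows usable for training.

    rows: list of (observation_date, record) tuples (hypothetical layout).
    """
    cutoff = training_cutoff(prediction_day)
    train = [row for row in rows if row[0] <= cutoff]
    return cutoff, train


print(training_cutoff(date(2024, 1, 10)))  # a Wednesday -> 2024-01-05
```

Every day from Saturday through the following Friday maps to the same cutoff, so one weekly model serves seven daily predictions and nothing after the cutoff leaks into training.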

Dan_Z
Databricks Employee

@Nestor Sulikowski I'm honestly having a little trouble grasping what you are trying to achieve. Surely, if you have some function that you use to cull the historical data or other time series features, you can save those and the resulting data in the Feature Store. I don't see why you would have any issues with out-of-sample prediction if you are not using a time-series model.

NAS
New Contributor III

Thanks. I should write a more detailed question, but I'm lazy 🙂

Thanks for taking the time to share your thoughts.

sean_owen
Databricks Employee

In this case, you just want your feature table to have a timestamp column declared as a timestamp key. Compute your features as of whatever dates you like and write them to the table; those rows are what you train on. At prediction time, to predict as of "now", pass the current time as the value of the timestamp key. Under the hood the lookup is an as-of join: for each key it matches the row with the latest timestamp <= the value you pass.
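
The as-of matching rule can be illustrated with a small stand-in written in plain Python. This is only a sketch of the "latest time <= your value" semantics, not the Feature Store implementation; `as_of_lookup`, the sample feature rows, and the month-end dates are all made up for illustration.

```python
from bisect import bisect_right
from datetime import date


def as_of_lookup(feature_rows, as_of):
    """Return the row with the latest timestamp <= as_of, or None.

    feature_rows: list of (timestamp, features) sorted by timestamp ascending.
    """
    timestamps = [ts for ts, _ in feature_rows]
    # bisect_right gives the insertion point after any equal timestamps,
    # so the row just before it is the latest one that is <= as_of.
    idx = bisect_right(timestamps, as_of)
    if idx == 0:
        return None  # no feature row exists at or before as_of
    return feature_rows[idx - 1]


# Hypothetical daily features (with gaps) for a single entity.
daily = [
    (date(2024, 1, 30), {"x": 1.0}),
    (date(2024, 1, 31), {"x": 2.0}),
    (date(2024, 2, 28), {"x": 3.0}),
    (date(2024, 3, 1), {"x": 4.0}),
]

# A monthly training set built from the same daily table: look up each
# month-end date instead of recomputing the features at monthly frequency.
month_ends = [date(2024, 1, 31), date(2024, 2, 29)]
monthly = [as_of_lookup(daily, d) for d in month_ends]
```

Here the 2024-02-29 lookup falls back to the 2024-02-28 row, because that is the latest one at or before the requested date. The same rule covers the original question: the training set picks the dates it cares about (weekly, monthly, or "now"), and the daily-keyed table does not need to be rebuilt.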
