10-24-2021 11:08 PM
For instance, could I train a new model every Saturday on data up to the previous Friday, and then use that model to make daily predictions during the following week?
In the same context, if the features are keyed by date, could I create a training set at a different frequency (e.g. monthly)? Or would I have to recreate the features at monthly frequency?
10-26-2021 12:14 PM
I'm guessing you are using some sort of time-series model here, one that relies on auto-correlation? For that you usually need to work with a complete time series. Are you making one prediction for the entire week (by day), or a separate prediction every day? What `feature` do you want to store in the Feature Store to help achieve this?
10-26-2021 01:28 PM
Hi Dan. I'm not using a time series model in this particular case, just a classification model that takes the latest observation of a number of time series, plus a few historical observations for some of them. I make daily predictions, but recalibrate the model every week.
10-26-2021 08:35 PM
@Nestor Sulikowski I'm honestly having a little trouble grasping what you are trying to achieve. Surely, if you have some function that you use to cull the historical data or other time series features, you can save those and the resulting data in the Feature Store. I don't see why you would have any issues with out-of-sample prediction if you are not using a time-series model.
10-27-2021 05:15 AM
Thanks. I should write a more detailed question, but I'm lazy.
Thanks for taking the time to share your thoughts.
12-16-2021 06:46 PM
In this case, you just want your feature store to have a timestamp column as a timestamp key. You would compute your features as of whatever dates you like and add them as features, and those are used to train. At runtime, to make a prediction as of "now", you pass the current time as the value of the timestamp key. Under the hood it's an as-of join, which matches the latest time <= your value when finding the matching row.
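To illustrate the as-of matching rule, here is a minimal sketch in plain Python. It is not the Feature Store API: the `feature_rows` data and the `as_of_lookup` helper are hypothetical, and only show how "latest time <= your value" picks a row.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature rows, sorted by timestamp key: (timestamp, feature values).
feature_rows = [
    (datetime(2021, 12, 10), {"latest_obs": 1.0}),
    (datetime(2021, 12, 13), {"latest_obs": 1.4}),
    (datetime(2021, 12, 15), {"latest_obs": 0.9}),
]

def as_of_lookup(rows, ts):
    """Return the features of the latest row whose timestamp is <= ts, or None."""
    timestamps = [t for t, _ in rows]
    idx = bisect_right(timestamps, ts)  # number of rows with timestamp <= ts
    return rows[idx - 1][1] if idx > 0 else None

# A lookup "as of" Dec 14 matches the Dec 13 row, never the later Dec 15 row.
features_now = as_of_lookup(feature_rows, datetime(2021, 12, 14, 9, 30))
```

Because a lookup never matches a row later than the requested time, the same rule that serves "now" at prediction time also keeps future values out of a training set built as of past dates.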
4 weeks ago
Hello, I just came across this and I have a similar question. I am quite new to Databricks and the feature store. I would like to use it, but I am having some difficulty figuring out what exactly I can do with it.
In my case I am using XGBoost regression to forecast the next 48 hours, with lagged values and rolling averages of the target variable that span from 1 hour to 48 hours back in the time series. I therefore need to recalculate these lag and rolling-average variables while the model is making its forecast: I treat the forecasted value at t+1 as the true target when creating the forecast for t+2, and so on. So I can't just store the lag and rolling-average variables of the true target in the feature store, or can I? Is there something else I would have to do to be able to use the feature store in this situation?
I hope this makes sense. I appreciate any input and help.
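A minimal sketch of the recursive loop described here may make the setup concrete. Everything in it is an assumption for illustration: `toy_predict` is a stand-in for a fitted regressor's predict call, and the three lags and the 3-step window stand in for the 1 to 48 hour features.

```python
from statistics import mean

def make_features(history, lags=(1, 2, 3), window=3):
    """Build lag and rolling-mean features from the most recent values."""
    feats = [history[-lag] for lag in lags]
    feats.append(mean(history[-window:]))
    return feats

def recursive_forecast(predict, history, horizon):
    """Forecast `horizon` steps, feeding each prediction back in as if observed."""
    extended = list(history)  # copy so the true history is not modified
    forecasts = []
    for _ in range(horizon):
        y_hat = predict(make_features(extended))
        forecasts.append(y_hat)
        extended.append(y_hat)  # treated as the target value for later steps
    return forecasts

# Hypothetical stand-in for a fitted model: returns the rolling-mean feature.
def toy_predict(feats):
    return feats[-1]

forecasts = recursive_forecast(toy_predict, [1.0, 2.0, 3.0], horizon=2)
```

In a loop like this, only the features built from observed values are known before the forecast starts; any feature that depends on an earlier forecast has to be rebuilt inside the loop.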