01-28-2022 05:57 AM
I have created a feature table (Databricks Runtime ML 10.2) that includes a timestamp column as a primary key. The column is not used as a feature, only as a column to join on.
I have then created a model that trains from this feature table and some additional data, excluding the primary keys. I tried excluding them both through the Feature Store API and through the sklearn API. The model trains fine, but when I use the score_batch() method, I get this error: "TypeError: float() argument must be a string or a number, not 'Timestamp'".
This error is coming from sklearn, so is there some incompatibility there, or is this a bug in feature store?
Steps to reproduce:
01-28-2022 07:15 AM
Maybe you can just try casting the timestamp to int.
01-29-2022 05:20 AM
Thanks for your reply Hubert. Yes, casting it to long or int does solve the issue, but it is a workaround and I would like to keep the data as-is, with directly interpretable timestamps, especially when there is no reason why they should trigger an error during the prediction step since it is not being used at that stage.
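For anyone who wants to apply the cast while keeping the readable timestamps, here is a minimal pandas sketch of the workaround discussed above. It is only an illustration under assumptions: the column names (event_ts, event_ts_epoch, amount), the sample rows, and the helper add_epoch_seconds are hypothetical and not from the original post, and the timestamps are assumed to be timezone-naive. The idea is to keep the original timestamp column for inspection and add an integer epoch-seconds column that can serve as the join key.

```python
import pandas as pd

# Reference point for converting timestamps to integer seconds.
EPOCH = pd.Timestamp("1970-01-01")


def add_epoch_seconds(df, ts_col, out_col):
    """Return a copy of df with an integer epoch-seconds column derived from ts_col.

    Assumes ts_col holds timezone-naive datetimes (hypothetical helper).
    """
    out = df.copy()
    # Timedelta floor division yields whole seconds as integers,
    # independent of the datetime resolution pandas picked.
    out[out_col] = (out[ts_col] - EPOCH) // pd.Timedelta(seconds=1)
    return out


# Hypothetical sample data; not taken from the original post.
events = pd.DataFrame(
    {
        "event_ts": pd.to_datetime(["2022-01-28 05:57:00", "2022-01-29 05:20:00"]),
        "amount": [10.0, 12.5],
    }
)

keyed = add_epoch_seconds(events, "event_ts", "event_ts_epoch")

# The integer key converts to float without error, unlike a Timestamp.
as_float = float(keyed["event_ts_epoch"].iloc[0])

# The readable timestamps can be recovered from the integer key when needed.
restored = pd.to_datetime(keyed["event_ts_epoch"], unit="s")
```

On a Spark DataFrame the analogous step would be casting the timestamp column to long before writing the feature table, which is the cast described in the replies above.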
03-11-2022 01:39 AM
Hi @Thibault Daoulas, Databricks released Runtime ML 10.2 in December 2021. Here are the important improvements. You can also refer to the documentation here.
Databricks Runtime ML includes AutoML, a tool to automatically train machine learning pipelines.
The FeatureStoreClient interface has been simplified.
For more information, see Work with feature tables and Databricks Feature Store Python API.
03-14-2022 05:06 PM
Hi @Thibault Daoulas,
Did @Kaniz Fatma's response help you resolve your question? If yes, please mark it as the best response. If not, please let us know.
03-15-2022 12:56 AM
Hi, it did not, but at least I know that timestamps are not fully supported, so the workaround is to avoid them. I suppose you can mark this as resolved.
03-15-2022 11:34 PM
Thank you @Thibault Daoulas for the update. Can you mark whichever answer you feel is the best?