Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

mrcity
by New Contributor II
  • 2205 Views
  • 3 replies
  • 1 kudos

Exclude absent lookup keys from dataframes made by create_training_set()

I've got data stored in feature tables, plus in a data lake. The feature tables are expected to lag the data lake by at least a little bit. I want to filter data coming out of the feature store by querying the data lake for lookup keys out of my inde...

Latest Reply
Quinten
New Contributor II
  • 1 kudos

I'm facing the same issue as described by @mrcity. There is no easy way to alter the dataframe, which is created inside the score_batch() function. Filtering out rows in the (sklearn) pipeline itself is also not convenient since these transformers ar...

2 More Replies
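
One way to approach the lookup-key filtering discussed in this thread is a semi-join: keep only the rows whose lookup key already exists in the feature table, then build the training set from what remains. The sketch below shows the idea in plain Python. The function name `filter_present_keys`, the column `customer_id`, and the sample data are all hypothetical; on Spark DataFrames the same idea would usually be expressed as a `left_semi` join rather than a Python loop.

```python
from typing import Any, Dict, Hashable, Iterable, List


def filter_present_keys(
    rows: Iterable[Dict[str, Any]],
    feature_keys: Iterable[Hashable],
    key: str,
) -> List[Dict[str, Any]]:
    """Keep only rows whose lookup key is present in the feature table."""
    present = set(feature_keys)  # set gives constant-time membership checks
    return [row for row in rows if row[key] in present]


# Hypothetical data: the data lake is ahead of the feature table.
lake_rows = [
    {"customer_id": 1, "label": 0},
    {"customer_id": 2, "label": 1},
    {"customer_id": 3, "label": 0},  # not yet in the feature table
]
feature_table_keys = [1, 2]

training_rows = filter_present_keys(lake_rows, feature_table_keys, "customer_id")
print(training_rows)
```

Filtering before the feature lookup means rows without features never reach the model, which avoids patching the dataframe afterwards.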
Yashir
by New Contributor III
  • 3581 Views
  • 5 replies
  • 4 kudos

Is there a way to add Features descriptions for each of the features in a Feature Store table?

If not, I believe it would be beneficial: feature tables contain engineered features, and it's a good idea to document their calculation logic for the benefit of other data scientists. Also, even non-engineered features are many times no...

Latest Reply
deep_thought
Contributor
  • 4 kudos

I also would like to see support added for feature description get/set methods.

4 More Replies
Chengcheng
by New Contributor III
  • 1793 Views
  • 1 replies
  • 4 kudos

Is Feature Store packaged model compatible with Spark UDF?

Hi, I tried to deploy a Feature Store packaged model into a Delta Live Table using mlflow.pyfunc.spark_udf in Azure Databricks. This model was built by Databricks AutoML with a joined feature table inside it. And I'm trying to make predictions using the fol...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Chengcheng Guo, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

pcriado
by New Contributor III
  • 6022 Views
  • 2 replies
  • 1 kudos

Resolved! Requested array size exceeds VM limit when saving to feature table

Hi, I'm trying to process a small dataset (less than 300 MB) composed of five queries that run with Spark. The end result of those queries is parsed using Python and merged into a data frame. Then I try to write this to a Delta Lake table using featu...

Latest Reply
pcriado
New Contributor III
  • 1 kudos

Hello, we have recently found that it's my user in particular that causes the memory issue. Two other users in my organization can run the same notebook without problems, but my user consistently consumes all available RAM and crashes the cluster... a...

1 More Replies
lewit
by New Contributor II
  • 1720 Views
  • 2 replies
  • 1 kudos

Is it possible to create a feature store training set directly from a feature store table?

Rather than joining features from different tables, I just wanted to use a single feature store table and select some of its features, but still log the model in the feature store. The problem I am facing is that I do not know how to create the train...

Latest Reply
Debayan
Databricks Employee
  • 1 kudos

Hi, could you please refer to https://docs.databricks.com/machine-learning/feature-store/train-models-with-feature-store.html#create-a-trainingset-using-the-same-feature-multiple-times and let us know if this helps?

1 More Replies
lawrence009
by Contributor
  • 1979 Views
  • 3 replies
  • 2 kudos

FutureWarning: ``databricks.feature_store.entities.feature_table.FeatureTable.keys`` is deprecated since v0.3.6

I'm getting this message with the following code:
from databricks import feature_store
fs = feature_store.FeatureStoreClient()
fs.create_table(name='feature_store.user_login', primary_keys=['user_id'], df=df_x, description='user l...

Latest Reply
DavideAnghileri
Contributor
  • 2 kudos

Yes, it's a nice thing to do. You can report it here: https://community.databricks.com/s/topic/0TO3f000000CnKrGAK/bug-report and if it's more urgent or blocking for you, you can also open a ticket to the help center: https://docs.databricks.com/resou...

2 More Replies
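
Until a deprecation warning like this one is fixed upstream, a caller can silence just that warning category around the affected call. The sketch below uses only the standard `warnings` module. `create_table_stub` is a hypothetical stand-in that emits a similar `FutureWarning`; it is not the Feature Store API, and the warning text is illustrative.

```python
import warnings


def create_table_stub(name: str) -> str:
    """Hypothetical stand-in for a library call that emits a FutureWarning."""
    warnings.warn("FeatureTable.keys is deprecated", FutureWarning)
    return name


def call_quietly(name: str) -> str:
    """Run the call with FutureWarning suppressed only inside this block."""
    with warnings.catch_warnings():
        # The filter is restored automatically when the block exits.
        warnings.simplefilter("ignore", category=FutureWarning)
        return create_table_stub(name)


result = call_quietly("feature_store.user_login")
print(result)
```

Scoping the filter with `catch_warnings()` keeps other `FutureWarning` messages visible elsewhere in the notebook, so unrelated deprecations are not hidden.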
atul1146
by New Contributor III
  • 2236 Views
  • 2 replies
  • 5 kudos

Resolved! Databricks set up in Prod environment

Hi! Can anyone please help me with documentation that can help me set up the integration between Databricks and AWS without the QuickStart default CloudFormation template? I want to use my own CFT rather than the default due to security ...

Latest Reply
Pat
Honored Contributor III
  • 5 kudos

Hi @Atul S, I think that Terraform is the recommended way to go for Databricks deployment. I mean, it's also supported now by Databricks support. I haven't looked much into the CloudFormation setup, because we decided to go with Terraform in the comp...

1 More Replies
Nath
by New Contributor II
  • 2172 Views
  • 3 replies
  • 2 kudos

Resolved! Error with multiple FeatureLookup calls outside databricks

I access the Databricks Feature Store from outside Databricks with databricks-connect in my IDE, PyCharm. The problem occurs only outside Databricks, not in a notebook inside Databricks. I use the FeatureLookup mechanism to pull data from Feature Store tables in my cus...

Latest Reply
shan_chandra
Databricks Employee
  • 2 kudos

Also, please refer to the KB article below for additional resolution: https://learn.microsoft.com/en-us/azure/databricks/kb/dev-tools/dbconnect-protoserializer-stackoverflow

2 More Replies
Alex_G
by New Contributor II
  • 2097 Views
  • 1 replies
  • 4 kudos

Resolved! Databricks Feature Store in MLflow run CLI command

Hello! I am attempting to move some machine learning code from a Databricks notebook into an MLflow git repository. I am utilizing the Databricks Feature Store to load features that have been processed. Currently I cannot get the databricks library to ...

Latest Reply
sean_owen
Databricks Employee
  • 4 kudos

Hm, what error do you get? I believe you won't be able to specify the feature store library as a dependency, as it's not externally published yet, but code that uses it should run on DB ML runtimes as it already exists there

MoJaMa
by Databricks Employee
  • 1132 Views
  • 1 replies
  • 0 kudos
Latest Reply
MoJaMa
Databricks Employee
  • 0 kudos

Feature table deletion is a potentially dangerous operation, since downstream consumers of feature tables (models, online stores, jobs, etc.) may break due to the deletion. We might support a safe way to do this in the future. In the meantime, we may be ...

User16826992666
by Valued Contributor
  • 1119 Views
  • 1 replies
  • 0 kudos

Resolved! If I create a Feature Store, how is the underlying data actually saved?

And do I have any control over where and how it's saved?

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

The offline store is backed by Delta tables. On AWS we support Amazon Aurora (MySQL-compatible) and Amazon RDS MySQL, and on Azure we support Azure Database for MySQL and Azure SQL Database as online stores https://docs.microsoft.com/en-us/azure/d...
