10-17-2024 04:13 AM
Hey,
I am new to Databricks, and I am trying to test the mlops-stack bundle.
Within that bundle there is a feature-engineering workflow, and I am having trouble getting it to run.
The main problem is the following.
The bundle specifies the target as $bundle.target, which in my case would be dev. I have created the dev catalog and, within it, the project schema according to the template.
The issue is that when I run the workflow, the notebook fails at:
from databricks.feature_engineering import FeatureEngineeringClient

fe = FeatureEngineeringClient()

# Create the feature table if it does not exist first.
# Note that this is a no-op if a table with the same name and schema already exists.
fe.create_table(
    name=output_table_name,
    # Include timeseries column in primary_keys
    primary_keys=[x.strip() for x in pk_columns.split(",")] + [ts_column],
    timestamp_keys=[ts_column],
    df=features_df,
)

# Write the computed features dataframe.
fe.write_table(
    name=output_table_name,
    df=features_df,
    mode="merge",
)
I am getting this error:
ValueError: Catalog 'dev' does not exist in the metastore.
And I don't understand why. If I ran the notebook through my own cluster.
I tried granting all privileges to all users in the workspace, but it did not help.
10-17-2024 04:33 AM
The dev you mention in the bundle target is different from the dev catalog.
What is the value of "output_table_name"? If it is a three-level namespace value (catalog_name.db_name.table_name), please make sure you have write access to that catalog and db_name.
Read more here:
https://docs.databricks.com/en/dev-tools/bundles/deployment-modes.html
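To see which catalog and schema the three-level name mentioned above actually points at, a small helper along these lines may be useful. It is only a sketch: the function name split_table_name and the sample table name are made up for illustration, and it does not handle backtick-quoted identifiers that contain dots.

```python
def split_table_name(full_name: str) -> tuple[str, str, str]:
    """Split 'catalog.schema.table' into its three parts."""
    parts = [p.strip() for p in full_name.split(".")]
    # Reject anything that is not exactly three non-empty parts.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Expected catalog.schema.table, got {full_name!r}")
    return parts[0], parts[1], parts[2]


# Hypothetical value; substitute whatever output_table_name holds in the notebook.
catalog, schema, table = split_table_name("dev.my_project.trip_features")
print(f"catalog={catalog} schema={schema} table={table}")
```

Printing the parts at the top of the notebook makes it easier to compare the catalog the job is asking for against the catalogs the cluster can actually see.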
10-17-2024 04:38 AM
Hi @gchandra
Thanks for the answer. The output table name is the one inside the Databricks mlops-stack template.
Regarding $bundle.target,
according to your (Databricks-commented) databricks.yml:
10-17-2024 04:49 AM
Apologies, I misread your question.
Can you please share your databricks.yml file or the URL you followed?
10-17-2024 04:54 AM
This is the mlops-stack I am trying to follow.
https://github.com/databricks/mlops-stacks/tree/main/template/%7B%7B.input_root_dir%7D%7D/%7B%7Btemp...
I instantiated it by:
I am first trying to test all project-related workflows in dev and how they interact; later I want to test with CI/CD so I can deploy across three different workspaces (dev/staging/prod).
10-17-2024 06:18 AM
When I execute the notebook through databricks bundle run -t dev write_feature_table_job,
I printed the available catalogs and got only spark_catalog.
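For reference, the catalog check described above can be sketched roughly like this. The helper names list_catalogs and catalog_visible are made up for the example, and it assumes the predefined spark session of a Databricks notebook; the hard-coded list simply mirrors what the job run reported.

```python
def list_catalogs(spark) -> list[str]:
    """Return the catalog names visible to the given Spark session."""
    # SHOW CATALOGS returns one row per catalog; take the first column.
    return [row[0] for row in spark.sql("SHOW CATALOGS").collect()]


def catalog_visible(available: list[str], wanted: str) -> bool:
    """True if the wanted catalog is among the visible ones."""
    return wanted in set(available)


# In a Databricks notebook `spark` is predefined, so this would be:
#   catalogs = list_catalogs(spark)
catalogs = ["spark_catalog"]  # what the bundle job run printed
print(catalog_visible(catalogs, "dev"))  # prints False
```

Running the same check on the interactive cluster and in the job run shows whether the two see different catalogs.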
10-17-2024 06:25 AM
Is your workspace UC enabled?
10-17-2024 06:43 AM
Yes it is.