2 weeks ago
Hey,
I am new to Databricks, and I am trying to test the mlops-stacks bundle.
Within that bundle there is a feature-engineering workflow, and I have a problem making it run.
The main problem is the following:
the bundle specifies the target as ${bundle.target}, which in my case is dev. I have created the dev catalog and, within it, the project schema, according to the template.
The issue is that when I run the workflow, the notebook fails at:
from databricks.feature_engineering import FeatureEngineeringClient

fe = FeatureEngineeringClient()

# Create the feature table if it does not exist first.
# Note that this is a no-op if a table with the same name and schema already exists.
fe.create_table(
    name=output_table_name,
    primary_keys=[x.strip() for x in pk_columns.split(",")] + [ts_column],  # Include timeseries column in primary_keys
    timestamp_keys=[ts_column],
    df=features_df,
)

# Write the computed features dataframe.
fe.write_table(
    name=output_table_name,
    df=features_df,
    mode="merge",
)
with:
ValueError: Catalog 'dev' does not exist in the metastore.
And I don't understand why, even when I run the notebook on my own cluster.
I tried granting all privileges to all users in the workspace, but it did not help.
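To narrow this down, it can help to check which catalogs the job's cluster actually sees before calling create_table. This is only a sketch (the helper name is mine, not from the template); in a Databricks notebook the catalog list would come from spark.sql("SHOW CATALOGS"), shown here as a comment so the check itself stays plain Python:

```python
# Diagnostic sketch (assumes a Databricks notebook, where `spark` is provided).
# The list of visible catalogs would come from:
#   catalogs = [row.catalog for row in spark.sql("SHOW CATALOGS").collect()]
# The pure-Python check below mirrors what the error means: the target
# catalog is simply not in that list.

def assert_catalog_exists(catalog: str, visible_catalogs: list[str]) -> None:
    """Fail early with a clear message if the target catalog is not visible."""
    if catalog not in visible_catalogs:
        raise ValueError(f"Catalog '{catalog}' does not exist in the metastore.")

# If the job cluster is not Unity Catalog enabled, the list typically
# contains only the legacy catalog:
assert_catalog_exists("spark_catalog", ["spark_catalog"])  # passes
```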
2 weeks ago
The dev you mention in the bundle target is different from the dev catalog.
What is the value of "output_table_name"? If it is a three-level namespace value (catalog_name.schema_name.table_name), please make sure you have write access to that catalog and schema.
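A minimal sketch of what "three-level namespace" means here (the helper name and example values are illustrative, not from the template):

```python
def split_table_name(output_table_name: str) -> tuple[str, str, str]:
    """Split a Unity Catalog three-level name into (catalog, schema, table)."""
    parts = output_table_name.split(".")
    if len(parts) != 3:
        raise ValueError(
            f"Expected catalog.schema.table, got {output_table_name!r}"
        )
    catalog, schema, table = parts
    return catalog, schema, table

# Illustrative mlops-stacks-style name: the first component is the catalog
# that must exist and be writable.
print(split_table_name("dev.my_project.feature_store"))
```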
Read more here
https://docs.databricks.com/en/dev-tools/bundles/deployment-modes.html
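For reference, the relevant part of a bundle config looks roughly like this (a sketch with illustrative names, not your actual file): the target name "dev" only selects a deployment mode and workspace, it does not create or reference a UC catalog.

```yaml
# databricks.yml (sketch)
bundle:
  name: my_mlops_project  # illustrative

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://<your-workspace>.cloud.databricks.com  # placeholder
```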
2 weeks ago
Hi @gchandra,
Thanks for the answer. The output table name is what is inside the Databricks mlops-stacks template.
Regarding ${bundle.target}:
according to your (Databricks-commented) databricks.yml:
2 weeks ago
Apologies, I misread your question.
Can you please share your databricks.yml file or the URL you followed?
2 weeks ago
This is the mlops-stacks repo I am trying to follow.
https://github.com/databricks/mlops-stacks/tree/main/template/%7B%7B.input_root_dir%7D%7D/%7B%7Btemp...
I instantiated it by:
I am first trying to test all project-related workflows in dev and see how they interact; later I want to test with CI/CD so I can deploy across three different workspaces (dev/staging/prod).
2 weeks ago
When I execute the notebook through databricks bundle run -t dev write_feature_table_job
and print the available catalogs, I only get spark_catalog.
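Seeing only spark_catalog usually means the job's cluster is running in an access mode that cannot see Unity Catalog. A sketch of what the job cluster spec in the bundle resources might need (field values are illustrative assumptions; check your own resources/*.yml):

```yaml
# job cluster spec (sketch): a cluster must run in a UC-capable access mode
# to see UC catalogs; otherwise SHOW CATALOGS returns only spark_catalog.
new_cluster:
  spark_version: 15.4.x-cpu-ml-scala2.12  # illustrative
  node_type_id: i3.xlarge                 # illustrative
  num_workers: 1
  data_security_mode: SINGLE_USER         # UC-capable access mode
```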
2 weeks ago
Is your workspace UC enabled?
2 weeks ago
Yes it is.