Data Engineering

asset bundles and compute policies

-werners-
Esteemed Contributor III

Has anyone succeeded in using already-existing compute policies (created via the UI) in asset bundles for creating a job?
I defined the policy_id in the resources/job YAML for the job_cluster, but when deploying I get errors saying the Spark version is not defined (it is defined in the policy), or that other parameters are missing (all of them defined in the policy).
So it seems the policy is not fetched or applied.
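
For reference, here is a minimal sketch of the kind of job resource YAML described above, with only policy_id set on the job cluster (the resource name, notebook path, and policy ID are placeholders, not taken from the original post):

  # resources/my_job.yml (hypothetical example)
  resources:
    jobs:
      my_job:
        name: my_job
        job_clusters:
          - job_cluster_key: main
            new_cluster:
              policy_id: "0123456789ABCDEF"   # existing policy created in the UI
        tasks:
          - task_key: main_task
            job_cluster_key: main
            notebook_task:
              notebook_path: ../notebooks/main_notebook

Deploying a configuration like this is what reportedly triggers errors such as "spark version is not defined".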

3 REPLIES

szymon_dybczak
Contributor III

Hi @-werners-,

I think you ran into the same kind of issue as the others in the discussion below. There is an ongoing issue with the Terraform provider; you can take a look at the GitHub thread:

DAB deployment fails with `Error: cannot create job: NumWorkers could be 0 only for SingleNode clust...

-werners-
Esteemed Contributor III

It looks like it, but I also get errors on non-single-node clusters.
There might be an underlying issue with policy settings not being applied.
Thanks for the link, though.

-werners-
Esteemed Contributor III
(Accepted solution)

So I figured it out.
You can indeed reference existing cluster policies, but I made the mistake of assuming that all cluster config gets filled in automatically by doing so.
In fact you still have to set some cluster config in the resources YAML:

- spark_version
- spark_conf + custom_tags (for single-node clusters, see the link Szymon posted)
- node_type_id + driver_node_type_id
After adding those to the YAML (see the sketch below), deployment succeeded.

I don't know why it works this way; perhaps it is linked to the policy definition (e.g. optional attributes in the policy), but it would be nice if the requirements were documented.
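
Putting those findings together, a sketch of what the working job cluster definition might look like (the policy ID, Spark version, and node types are placeholder values, and the single-node spark_conf/custom_tags entries follow the usual Databricks single-node pattern referenced in the linked thread):

  # resources/my_job.yml (hypothetical; shows the fields that had to be set explicitly)
  resources:
    jobs:
      my_job:
        name: my_job
        job_clusters:
          - job_cluster_key: main
            new_cluster:
              policy_id: "0123456789ABCDEF"      # existing policy created in the UI
              spark_version: "15.4.x-scala2.12"  # still required even though the policy defines it
              node_type_id: "Standard_DS3_v2"
              driver_node_type_id: "Standard_DS3_v2"
              num_workers: 0                     # single-node job cluster
              spark_conf:
                spark.databricks.cluster.profile: "singleNode"
                spark.master: "local[*]"
              custom_tags:
                ResourceClass: "SingleNode"

For a regular multi-node cluster, the single-node spark_conf/custom_tags entries and num_workers: 0 would be dropped and a worker count set instead.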
