
policy_id in databricks asset bundle workflow

Aria
New Contributor III

We are using Databricks Asset Bundles for code deployment, and the biggest issue I am facing is that the policy_id is different in each environment. I tried with environment variables in Azure DevOps, and also with declaring the variables in databricks.yaml and then using them in the resources folder. However, nothing has worked so far.

@policy_id @DAB

4 REPLIES

maikl
New Contributor II

Hi Aria,

did you solve the issue, or did you find a workaround?

Thank you.

-werners-
Esteemed Contributor III

I use it and it works (kinda).
What you have to do is define a variable for the compute policy in the variables section (databricks.yml).
In the targets section (also databricks.yml) you set the policy ID per environment (dev, prod, ...).
Then, in your job YAML, you reference the variable in the new_cluster section of the cluster definition:
policy_id: ${var.policy}
That way the policy will be used for cluster creation.
Mind that you still have to pass certain values even though the policy already contains them (spark_version, for example).
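
A minimal sketch of that setup, assuming placeholder policy IDs and a hypothetical bundle name, spark version, and node type (adjust these to your own workspaces):

# databricks.yml
bundle:
  name: my_bundle   # hypothetical bundle name

variables:
  policy:
    description: Cluster policy ID for the current target

targets:
  dev:
    variables:
      policy: <POLICY_ID from the dev workspace>
  prod:
    variables:
      policy: <POLICY_ID from the prod workspace>

# resources/read_data_lake.yml
resources:
  jobs:
    read_data_lake:
      name: read_data_lake
      job_clusters:
        - job_cluster_key: Job_cluster
          new_cluster:
            policy_id: ${var.policy}
            # these still have to be passed even if the policy already pins them
            spark_version: 15.4.x-scala2.12   # placeholder runtime version
            node_type_id: Standard_DS3_v2     # placeholder node type
            num_workers: 1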

maikl
New Contributor II

When you define the policy_id directly in the target block, the variable isn't needed. Here is what works for me:

For now, I solved it by overriding the value of policy_id in the targets section:

targets:
  dev:
    workspace:
      root_path: ~/DATABRICKS_BUNDLES
    resources:
      jobs:
        read_data_lake:
          name: read_data_lake
          job_clusters:
            - job_cluster_key: Job_cluster
              new_cluster:
                policy_id: <POLICY_ID from databricks-workspace-a>
  tst:
    workspace:
      root_path: ~/DATABRICKS_BUNDLES
    resources:
      jobs:
        read_data_lake:
          name: read_data_lake
          job_clusters:
            - job_cluster_key: Job_cluster
              new_cluster:
                policy_id: <POLICY_ID from databricks-workspace-b>
 
I think it's useful for a few jobs, but when you have more it becomes impractical, because you must define each job in the target block just to set its policy_id. If I missed something or am wrong, please let me know 🙂

-werners-
Esteemed Contributor III

Variables are useful, but it depends on how you set up the bundles.

I define a policy per target. Since 'policy' is not a property the target knows about by itself, I create a variable and assign it a different value depending on the environment.
This variable is then used in the jobs, which reside in another file.

I keep the environment definitions and the job definitions separate, so it is easier to promote from dev to prod.
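
With the variable-per-target setup sketched above, promoting is then just a matter of deploying the same bundle against a different target (target names illustrative):

databricks bundle validate -t dev
databricks bundle deploy -t dev
databricks bundle deploy -t prod

Each deploy resolves ${var.policy} to the value declared for that target, so the job definition file never changes between environments.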
