cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Warehouse ID specified in job yaml file for sql tasks

halsgbs
New Contributor

My goal is to trigger an alert I have through a job, and it seems I have to specify the warehouse id within the job yaml file itself. We have different environments with different warehouse ids, and the issue is that if I specify the warehouse id in the job then it wont work in higher environments, since they're different. I was wondering if there is a workaround I can do here? Is it possible to parameterise the warehouse id within the job yaml file? 

1 ACCEPTED SOLUTION

Accepted Solutions

szymon_dybczak
Esteemed Contributor III

If your resource already exists you can reference it by name using following technique (also described in an article(

szymon_dybczak_0-1767890351783.png

 

View solution in original post

5 REPLIES 5

bianca_unifeye
Contributor

Yes, it is possible. The approach is to parameterise the warehouse_id at deploy time (not at run time) using Databricks Asset Bundles variables and per-target overrides.

https://docs.databricks.com/aws/en/dev-tools/bundles/variables

What to do (action points)

  • Introduce a bundle variable for the warehouse id (for example warehouse_id).

  • Reference that variable anywhere you currently hardcode the warehouse id in the job YAML (so the job definition resolves the correct ID during deployment).

  • Set different values per environment/target (dev/test/prod) in your bundle configuration so each deployment writes the right warehouse id for that workspace/environment.

  • In CI/CD, inject the variable value via pipeline/environment variables when deploying each target (instead of hardcoding it in the YAML).

szymon_dybczak
Esteemed Contributor III

Hi @halsgbs ,

The best practices here it to use resource reference. So if you have definition in your warehouse:

 

resources:
  sql_warehouses:
    small_warehouse:

 

Then you reference it using the bundle resource key (small_warehouse), not the display name:

 

warehouse_id: ${resources.sql_warehouses.small_warehouse.id}

 

 

Here you can find article with above approach described in more details:

DABs: Referencing Your Resources | by Hubert Dudek - Freedium

Thanks for this @szymon_dybczak ! I think this creates a new resource, I tried doing this using the details of a warehouse we already have and I'm getting this error in devops: Error: cannot create sql endpoint: failed creating warehouse: SQL warehouse with name `Starter Endpoint` already exists. 

szymon_dybczak
Esteemed Contributor III

If your resource already exists you can reference it by name using following technique (also described in an article(

szymon_dybczak_0-1767890351783.png

 

halsgbs
New Contributor

Thank you! looks like the alert_id also needs to be parametised, and I was wondering if its possible to use a job parameter to do so? If I can use the alert name then that would be great but I believe it has to be the alert id, which will be different across environments as well as different everytime the alert gets dropped and recreated. I can use a notebook to get the alert id using the alert name and then pass it on using a job parameter. Is that possible?