Warehouse ID specified in job yaml file for sql tasks

halsgbs
New Contributor III

My goal is to trigger an alert I have through a job, and it seems I have to specify the warehouse id within the job yaml file itself. We have different environments with different warehouse ids, and the issue is that if I specify the warehouse id in the job then it wont work in higher environments, since they're different. I was wondering if there is a workaround I can do here? Is it possible to parameterise the warehouse id within the job yaml file? 

bianca_unifeye
Databricks MVP

Yes, it is possible. The approach is to parameterise the warehouse_id at deploy time (not at run time) using Databricks Asset Bundles variables and per-target overrides.

https://docs.databricks.com/aws/en/dev-tools/bundles/variables

What to do (action points)

  • Introduce a bundle variable for the warehouse id (for example warehouse_id).

  • Reference that variable anywhere you currently hardcode the warehouse id in the job YAML (so the job definition resolves the correct ID during deployment).

  • Set different values per environment/target (dev/test/prod) in your bundle configuration so each deployment writes the right warehouse id for that workspace/environment.

  • In CI/CD, inject the variable value via pipeline/environment variables when deploying each target (instead of hardcoding it in the YAML).

szymon_dybczak
Esteemed Contributor III

Hi @halsgbs ,

The best practices here it to use resource reference. So if you have definition in your warehouse:

 

resources:
  sql_warehouses:
    small_warehouse:

 

Then you reference it using the bundle resource key (small_warehouse), not the display name:

 

warehouse_id: ${resources.sql_warehouses.small_warehouse.id}

 

 

Here you can find article with above approach described in more details:

DABs: Referencing Your Resources | by Hubert Dudek - Freedium

Thanks for this @szymon_dybczak ! I think this creates a new resource, I tried doing this using the details of a warehouse we already have and I'm getting this error in devops: Error: cannot create sql endpoint: failed creating warehouse: SQL warehouse with name `Starter Endpoint` already exists. 

szymon_dybczak
Esteemed Contributor III

If your resource already exists you can reference it by name using following technique (also described in an article(

szymon_dybczak_0-1767890351783.png

 

View solution in original post

halsgbs
New Contributor III

Thank you! looks like the alert_id also needs to be parametised, and I was wondering if its possible to use a job parameter to do so? If I can use the alert name then that would be great but I believe it has to be the alert id, which will be different across environments as well as different everytime the alert gets dropped and recreated. I can use a notebook to get the alert id using the alert name and then pass it on using a job parameter. Is that possible?