Hi
Question is probably around best practices, but I'm curious if someone else has dealt with a similar situation. I have two Databricks workspaces, one for Dev and one for Prod. It had to be two workspaces because the Azure Landing Zones had to be air-gapped from each other. Each workspace has a few DLT pipelines and their associated jobs with schedules.
I've set up deployments through Asset Bundles and all of that works splendidly. Now there's a requirement to have some pipelines run only on DEV and some only on PROD. In my databricks.yml I currently use a single variable that I control per environment in my deployment pipeline, but it applies to all jobs in that environment:
variables:
  pauseStatus:
    description: Is pipeline "PAUSED"/"UNPAUSED"
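(For what it's worth, I know I could also pin the value per target in databricks.yml instead of passing it in from the pipeline, roughly as sketched below, with dev/prod as placeholder target names. But that's still a single value for every job in the environment:)

targets:
  dev:
    variables:
      pauseStatus: PAUSED    # target names here are just placeholders
  prod:
    variables:
      pauseStatus: UNPAUSED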
I reference the variable in each job's YAML:
schedule:
  quartz_cron_expression: "0 0 7 * * ?"
  pause_status: ${var.pauseStatus}
and the value itself is supplied by my deployment step:
databricks bundle deploy -t ${{variables.target}} -p DEFAULT \
  --var="pauseStatus=${{variables.pauseStatus}}"
The only method I can think of is a separate variable/parameter per job, each with its own status, passed down to the Asset Bundles deployment step. That can quickly grow out of control:
databricks bundle deploy -t ${{variables.target}} -p DEFAULT \
  --var="pauseStatusJob1=${{variables.pauseStatusJob1}}" \
  --var="pauseStatusJob2=${{variables.pauseStatusJob2}}" \
  --var="pauseStatusJob3=${{variables.pauseStatusJob3}}"
Is there a more intuitive and scalable way of achieving granular control over job pause statuses across different environments?
Thanks!