
Asset Bundles pause_status Across Different Environments

RangaSarangan
New Contributor

Hi

This is probably a best-practices question, but I'm curious whether someone else has dealt with a similar situation. I have two Databricks workspaces - one for Dev and one for Prod. It had to be two workspaces because the Azure Landing Zones had to be air-gapped from each other. Each workspace has a few DLT pipelines and their associated scheduled jobs.

I've set up deployments through Asset Bundles and all of that works splendidly. Now there's a requirement to have some pipelines running only on DEV and some only on PROD. In my databricks.yml I currently use a single variable that I set per environment in my deployment pipeline, but it applies to all jobs in that environment:

 

variables:
  pauseStatus:
    description: Whether the schedule is "PAUSED" or "UNPAUSED"

 

I reference this in each job's YAML as:

 

schedule:
  quartz_cron_expression: "0 0 7 * * ?"
  pause_status: ${var.pauseStatus}

 

The variable is then passed in my deployment step as:

 

databricks bundle deploy -t ${{variables.target}} -p DEFAULT \
  --var="pauseStatus=${{variables.pauseStatus}}"

 

The only method I can think of is declaring a separate variable per job, each with its own status, and passing them all down to the Asset Bundles deployment step - which can quickly grow out of control:

 

databricks bundle deploy -t ${{variables.target}} -p DEFAULT \
  --var="pauseStatusJob1=${{variables.pauseStatusJob1}}" \
  --var="pauseStatusJob2=${{variables.pauseStatusJob2}}" \
  --var="pauseStatusJob3=${{variables.pauseStatusJob3}}"

 

Is there a more intuitive and scalable way of achieving granular control over job pause statuses across different environments?

Thanks!

1 REPLY

Ajay-Pandey
Esteemed Contributor III

Hi @RangaSarangan,

We have faced the same issue and solved it using the Databricks Workflows (Jobs) API and a JSON file of job metadata that holds each job and its respective status for each environment.
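
For illustration, such a metadata file might look like this - the structure and job IDs here are hypothetical, and since job IDs differ between workspaces you could keep one file (or one section) per environment:

{
  "jobs": [
    { "name": "daily_ingest", "job_id": 1111, "pause_status": { "dev": "PAUSED", "prod": "UNPAUSED" } },
    { "name": "hourly_sync", "job_id": 2222, "pause_status": { "dev": "UNPAUSED", "prod": "PAUSED" } }
  ]
}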

You can then add an Azure DevOps stage that runs after your CI/CD pipeline and changes each job's status accordingly.
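
A minimal sketch of such a step in Python - the file name, JSON structure, and environment variables are assumptions to match the sketch above; the endpoints are the standard Jobs API 2.1 /jobs/get and /jobs/update:

# pause_jobs.py - run after the bundle deploy, e.g. python pause_jobs.py prod
import json
import os
import sys

import requests

HOST = os.environ["DATABRICKS_HOST"]  # e.g. https://adb-123.4.azuredatabricks.net
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

def set_pause_status(job_id: int, pause_status: str) -> None:
    # Read the job's current schedule so the cron expression and
    # timezone are preserved; only pause_status is changed.
    resp = requests.get(f"{HOST}/api/2.1/jobs/get",
                        headers=HEADERS, params={"job_id": job_id})
    resp.raise_for_status()
    schedule = resp.json()["settings"]["schedule"]  # assumes the job is scheduled
    schedule["pause_status"] = pause_status
    # jobs/update performs a partial update: only the fields present
    # in new_settings are replaced.
    resp = requests.post(f"{HOST}/api/2.1/jobs/update", headers=HEADERS,
                         json={"job_id": job_id, "new_settings": {"schedule": schedule}})
    resp.raise_for_status()

if __name__ == "__main__":
    env = sys.argv[1]  # "dev" or "prod", matching the keys in the JSON file
    with open("job_metadata.json") as f:
        metadata = json.load(f)
    for job in metadata["jobs"]:
        status = job["pause_status"][env]
        set_pause_status(job["job_id"], status)
        print(f"{job['name']}: schedule set to {status}")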

Ajay Kumar Pandey
