Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Databricks workflows, sample script/method to deploy jobs.json to other workspace

Databricks_-Dat
New Contributor II

Could someone point me in the right direction for deploying jobs from one workspace to another using a JSON file in a DevOps CI/CD pipeline? Thanks in advance.

3 REPLIES

yuvapraveen_k
New Contributor III

Hi, 

Please use the Databricks Jobs API to extract the workflow definitions into the Git repository as part of your check-in process before the PR. In your DevOps pipeline, you can submit the parameters as needed and deploy to your higher environment using a Databricks deploy task or an API call. During deployment, you can also substitute different cluster settings. In many cases, the cluster settings in the production environment will be larger than in development, since production handles more data.
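As a minimal sketch of the extraction step: the snippet below fetches a job's settings from the source workspace via the Jobs 2.1 API so they can be committed to Git. The host/token environment variables, the job ID, and the output path are assumptions; adapt them to your setup.

```python
# Sketch: export one job definition from the source workspace (Jobs API 2.1).
# DATABRICKS_HOST, DATABRICKS_TOKEN, job_id 123 and the output path are
# placeholders for illustration only.
import json
import os
import urllib.request


def jobs_get_url(host: str, job_id: int) -> str:
    """Endpoint for fetching a single job's definition (Jobs API 2.1)."""
    return f"{host.rstrip('/')}/api/2.1/jobs/get?job_id={job_id}"


def export_job(host: str, token: str, job_id: int) -> dict:
    """Return the job's 'settings' block, ready to commit to the Git repo."""
    req = urllib.request.Request(
        jobs_get_url(host, job_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["settings"]


if __name__ == "__main__":
    settings = export_job(
        os.environ["DATABRICKS_HOST"], os.environ["DATABRICKS_TOKEN"], 123
    )
    with open("jobs/my_job.json", "w") as f:
        json.dump(settings, f, indent=2)
```

Running this before the PR keeps the committed JSON in sync with what developers built in the workspace.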

Thanks for the quick reply. It sounds workable for us, since we are using a DevOps deployment pipeline. However, I can't find an automated way to link workflow definitions to Git. Do they have to be uploaded manually to Repos to check them in to Git? If possible, could you guide us on how the deployment pipeline could create the workflow in the destination workspace? Is it through running a Python script that creates the JSON with updated parameters? As you said, I would need to change the pool ID, cluster config, etc.

Thanks for your help!

yuvapraveen_k
New Contributor III

You're welcome. Databricks released a feature that links the workflow definition to Git automatically. Please refer to the link below:

https://www.databricks.com/blog/2022/06/21/build-reliable-production-data-and-ml-pipelines-with-git-...

In terms of the flow, below are the steps you should follow.

Step 1: Link the workflow with the Git repository using the Databricks feature. If that feature is not available to you, you can use a very simple Python or shell script (it makes only one API call) installed locally, following this documentation: https://docs.databricks.com/workflows/jobs/jobs-2.0-api.html. (This is the method we follow, since we implemented Git integration even before Databricks added the feature.)

Step 2: The developer checks in the code, and a peer approves the PR.

Step 3: The DevOps pipeline picks the workflow config from the Git folder and then either 1) performs a string replace in the Azure DevOps pipeline and deploys the Databricks workflow to your higher environment using a task or an API call, or 2) uses a Terraform task to deploy the workflow, which may be a little easier than option 1.
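Option 1 of Step 3 can be sketched as follows: load the committed job JSON, swap in production cluster settings, and create the job in the target workspace via the Jobs 2.1 API. The file path, the cluster override values, and the environment variables are assumptions for illustration.

```python
# Sketch of Step 3, option 1: patch cluster settings and deploy to the
# target workspace. Paths, override values and env vars are placeholders.
import json
import os
import urllib.request


def substitute_cluster(settings: dict, overrides: dict) -> dict:
    """Return a copy of the job settings with cluster fields (node type,
    worker count, pool ID, ...) replaced for the target environment."""
    patched = json.loads(json.dumps(settings))  # cheap deep copy
    for cluster in patched.get("job_clusters", []):
        cluster["new_cluster"].update(overrides)
    return patched


def create_job(host: str, token: str, settings: dict) -> int:
    """Create the job in the target workspace and return its job_id."""
    req = urllib.request.Request(
        f"{host.rstrip('/')}/api/2.1/jobs/create",
        data=json.dumps(settings).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["job_id"]


if __name__ == "__main__":
    with open("jobs/my_job.json") as f:
        settings = json.load(f)
    prod_settings = substitute_cluster(
        settings, {"node_type_id": "Standard_DS4_v2", "num_workers": 8}
    )
    create_job(
        os.environ["DATABRICKS_HOST"], os.environ["DATABRICKS_TOKEN"], prod_settings
    )
```

The same substitution step is where you would swap pool IDs or instance profiles per environment; a Terraform job resource (option 2) achieves the equivalent declaratively.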

Hope that helps.
