
Databricks workflows, sample script/method to deploy jobs.json to other workspace

Databricks_-Dat
New Contributor II

Could someone point me in the right direction to deploy Jobs from one workspace to another workspace using a JSON file in a DevOps CI/CD pipeline? Thanks in advance.

3 REPLIES

yuvapraveen_k
New Contributor III

Hi, 

Please use the Databricks Jobs API to extract the workflow definitions into the Git repository as part of your check-in process, before the PR. In your DevOps pipeline, you can pass in parameters as needed and deploy the definitions to your higher environment using the Databricks deploy task or an API call. In the process, you can also substitute different cluster settings during deployment. In many cases, the cluster settings in the production environment will be larger than in the development environment since it handles more data.
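
For illustration, here is a minimal Python sketch of the extraction step against the Jobs 2.0 API using the `requests` library. The environment variable names, the `export_job` helper, the output folder, and the job id are assumptions for the example, not something from this thread; authenticating with a personal access token is one common option.

```python
# Minimal sketch: export a job definition to JSON via the Jobs 2.0 API.
# DATABRICKS_HOST / DATABRICKS_TOKEN are assumed environment variables.
import json
import os

import requests

HOST = os.environ["DATABRICKS_HOST"]      # e.g. https://adb-123.azuredatabricks.net
TOKEN = os.environ["DATABRICKS_TOKEN"]    # personal access token
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

def export_job(job_id: int, out_dir: str = "jobs") -> str:
    """Fetch a job definition and write its settings as a JSON file."""
    resp = requests.get(f"{HOST}/api/2.0/jobs/get",
                        headers=HEADERS, params={"job_id": job_id})
    resp.raise_for_status()
    # Keep only the "settings" block; run-time fields like created_time
    # are not part of the deployable definition.
    settings = resp.json()["settings"]
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{settings['name']}.json")
    with open(path, "w") as f:
        json.dump(settings, f, indent=2)
    return path

if __name__ == "__main__":
    print(export_job(123))                # 123 is a placeholder job id
```

A script like this can run in the developer's local environment or as a pipeline step, so the exported JSON lands in the repository before the PR is raised.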

Thanks for the quick reply. It sounds workable for us since we are using a DevOps deployment pipeline. But I can't find an automated way to link workflow definitions to Git. Do they have to be manually uploaded to Repos to check in to Git? If possible, could you please guide me on how the deployment pipeline could create the workflow in the destination workspace? Is it through running a Python script which would create JSON with updated parameters? As you said, I would need to change the pool ID, cluster config, etc.

Thanks for your help!

yuvapraveen_k
New Contributor III

You are welcome. Databricks released a feature that links the workflow definition to Git automatically. Please refer to the link below:

https://www.databricks.com/blog/2022/06/21/build-reliable-production-data-and-ml-pipelines-with-git-...

In terms of the flow, below are the steps you should follow.

Step 1: Link the workflow with the Git repository using the Databricks feature. If that feature is not available to you, you can use a very simple Python or shell script (it makes only one API call) installed locally, following this documentation: https://docs.databricks.com/workflows/jobs/jobs-2.0-api.html (this is the method we follow, since we implemented Git integration even before Databricks added the feature).

Step 2: The developer checks in the code, and a peer approves the PR.

Step 3: The DevOps pipeline picks up the workflow config from the Git folder and either 1) performs a string replace in the Azure DevOps pipeline to substitute the environment-specific values, then deploys the Databricks workflow to your higher environment using the task or an API call (see the sketch below), or 2) uses the Terraform task to deploy the workflow, which may be a little easier than option 1.
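
To make option 1 concrete, below is a minimal Python sketch, assuming the job JSON was exported as in the earlier sketch and that DATABRICKS_HOST / DATABRICKS_TOKEN now point at the target workspace. The OVERRIDES values and the `deploy_job` helper are illustrative assumptions; a real pipeline would pull the overrides from pipeline variables or do the string replace in a pipeline task instead.

```python
# Minimal sketch: substitute cluster settings in an exported job JSON
# and deploy it to the target workspace via the Jobs 2.0 API.
import json
import os

import requests

HOST = os.environ["DATABRICKS_HOST"]      # target (e.g. production) workspace
TOKEN = os.environ["DATABRICKS_TOKEN"]
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# Environment-specific overrides applied before deployment;
# these values are placeholders for the example.
OVERRIDES = {
    "node_type_id": "Standard_DS4_v2",
    "num_workers": 8,
}

def deploy_job(json_path: str) -> int:
    """Create or overwrite a job in the target workspace from a JSON definition."""
    with open(json_path) as f:
        settings = json.load(f)

    # Substitute cluster settings wherever a new cluster is defined.
    # Multi-task (2.1-style) jobs carry a "tasks" list; single-task
    # (2.0-style) jobs keep "new_cluster" at the top level.
    for task in settings.get("tasks", [settings]):
        cluster = task.get("new_cluster")
        if cluster:
            cluster.update(OVERRIDES)

    # If a job with the same name already exists, overwrite its settings
    # with jobs/reset; otherwise create it. (On very large workspaces the
    # list response may be paginated.)
    existing = requests.get(f"{HOST}/api/2.0/jobs/list", headers=HEADERS)
    existing.raise_for_status()
    match = next((j for j in existing.json().get("jobs", [])
                  if j["settings"]["name"] == settings["name"]), None)
    if match:
        resp = requests.post(f"{HOST}/api/2.0/jobs/reset", headers=HEADERS,
                             json={"job_id": match["job_id"],
                                   "new_settings": settings})
        resp.raise_for_status()
        return match["job_id"]
    resp = requests.post(f"{HOST}/api/2.0/jobs/create",
                         headers=HEADERS, json=settings)
    resp.raise_for_status()
    return resp.json()["job_id"]
```

Matching on the job name and calling jobs/reset for existing jobs keeps the deployment idempotent, so re-running the pipeline overwrites the job instead of creating duplicates.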

Hope that helps.