Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How can I deploy workflow jobs to another databricks workspace?

Dean_Lovelace
New Contributor III

I have created a number of workflows in the Databricks UI.

I now need to deploy them to a different workspace.

How can I do that?

Code can be deployed via Git, but the job definitions are stored in the workspace only.

13 REPLIES 13

karthik_p
Esteemed Contributor

@Dean Lovelace If your jobs are configured against a Git repo, you can use the same repo in the other environments and change the cluster settings to match the environment you deploy to. Please check the article below, which should provide some info.

https://docs.databricks.com/repos/ci-cd-techniques-with-repos.html

I can't see anything in this link related to moving jobs (schedules) between environments. I have git integration, but workflow jobs are not stored in git.

Anonymous
Not applicable

Hi @Dean Lovelace​ 

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 

Dean_Lovelace
New Contributor III

The only way I can find to move workflow jobs (schedules) to another workspace is:

1) Create a job in the Databricks UI (Workflows -> Jobs -> Create Job)

2) Copy the JSON definition ("...", View JSON, Create, Copy)

3) Save the JSON locally or in the Git repo

4) Create the job in a different workspace (amending the cluster ID) with the Databricks CLI command "databricks jobs create --json-file ***.json".
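The "amending the cluster id" part of step 4 could be scripted rather than done by hand. Below is a minimal Python sketch, assuming the saved JSON refers to clusters through `existing_cluster_id` fields. The job definition, the cluster IDs, and the helper name `remap_cluster_ids` are all made up for illustration.

```python
import json
from typing import Any


def remap_cluster_ids(node: Any, id_map: dict[str, str]) -> Any:
    """Return a copy of a job definition with existing_cluster_id values
    swapped according to id_map (source workspace ID -> target workspace ID).
    IDs that are not in id_map are left unchanged."""
    if isinstance(node, dict):
        return {
            key: id_map.get(value, value)
            if key == "existing_cluster_id" and isinstance(value, str)
            else remap_cluster_ids(value, id_map)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [remap_cluster_ids(item, id_map) for item in node]
    return node


# Hypothetical job definition, standing in for the JSON copied in step 2.
job_definition = {
    "name": "nightly-load",
    "tasks": [
        {"task_key": "extract", "existing_cluster_id": "old-cluster-1"},
        {
            "task_key": "load",
            "depends_on": [{"task_key": "extract"}],
            "existing_cluster_id": "old-cluster-1",
        },
    ],
}

# Hypothetical mapping from source-workspace IDs to target-workspace IDs.
migrated = remap_cluster_ids(job_definition, {"old-cluster-1": "new-cluster-9"})

# The result can be saved to a file and passed to the CLI command in step 4.
print(json.dumps(migrated, indent=2))
```

The function returns a new structure instead of editing in place, so the file kept in Git can stay as the source-workspace version.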

Is this the only way?

karthik_p
Esteemed Contributor

@Dean Lovelace Instead of all this, integrate your workflow directly with Git, use the same repo in the other environments, and update the cluster.

The workflows are 'integrated' with git, as in they get the code from a git branch each time they run. I am not seeing an option to store the actual workflow definitions in git (task name, script path, dependencies, retries, schedule etc). I have a workflow with multiple tasks, and the dependencies need to be fully tested prior to a production deployment.

marct
New Contributor II

Hi @Dean_Lovelace,

did you find a solution that you can share?

Thanks in advance

 

Mumrel
Contributor

I have the same question:

Of course you can do everything with Terraform, but that forces you to develop pipelines as code instead of in a visual editor. If you start out with Terraform, every deployment is done the same, straightforward way, but in code rather than in the visual editor, so there is a steep learning curve.

I think getting Terraform code from an existing pipeline would be great, but I am not sure whether that works. Maybe one could build the same thing using the Databricks CLI, but this all seems like a lot of effort for something that many projects need.

I have seen projects do it for Azure Data Factory like this: edit pipelines in the UI, get the JSON description of the pipeline, replace/insert environment parameters, and deploy the ARM script to a different environment. I am also looking for the most native Databricks way.

Here is a hint how one could custom build it, I guess.

Job Json
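The "replace/insert environment parameters" step described above might look roughly like this for a Databricks job definition. It is a sketch using only Python's standard `string.Template`; the template contents, placeholder names, cluster IDs, and notebook path are invented for illustration and are not taken from any real workspace.

```python
import json
from string import Template

# Hypothetical job JSON with ${...} placeholders for the values that differ
# between workspaces. A literal "$" in the JSON would need to be written "$$".
JOB_TEMPLATE = Template("""{
  "name": "nightly-load-${env}",
  "tasks": [
    {
      "task_key": "load",
      "existing_cluster_id": "${cluster_id}",
      "notebook_task": {"notebook_path": "/Repos/${env}/project/load"}
    }
  ]
}""")

# Hypothetical per-environment values.
ENVIRONMENTS = {
    "test": {"env": "test", "cluster_id": "test-cluster-1"},
    "prod": {"env": "prod", "cluster_id": "prod-cluster-1"},
}


def render_job(env_name: str) -> dict:
    """Fill in the placeholders for one environment and parse the result.
    substitute() raises KeyError if a placeholder has no value, and
    json.loads() raises if the rendered text is not valid JSON."""
    rendered = JOB_TEMPLATE.substitute(ENVIRONMENTS[env_name])
    return json.loads(rendered)


prod_job = render_job("prod")
print(prod_job["name"])
```

Using `substitute()` rather than `safe_substitute()` is deliberate: a missing parameter fails loudly before anything is deployed.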

Radhikad
New Contributor II

Hello everyone, I need the same help from a Databricks expert. I have created a job, 'Job1', with runtime 12.2 in the 'Databricks1' workspace. I have integrated it with an Azure repo and tried deploying it to 'ENV1' using a CI/CD pipeline. It deploys successfully to ENV1, but not to ENV2 in the 'Databricks2' workspace with the same runtime 12.2. I mean, the job details are not showing up in the other workspace. Could anyone please help?

cpradeep
New Contributor III

@Dean_Lovelace 

Did you implement the solution? Please share how you implemented CI/CD for workflows.

itacdonev
New Contributor

It is not a seamless deploy but it worked for a one-shot transfer.

Here is what worked for me:

  1. go to the original workflow and click "switch to code version (YAML)"
  2. select all and copy
  3. go to the new workspace and click create new job
  4. click "switch to code version (YAML)" and paste
  5. in the new workspace, go to Compute and select the compute you wish to use. Then click View JSON and copy the compute ID
  6. go back to the new job in the new workspace and paste this compute ID wherever you have "existing_cluster_id": XXXXXX (replace the XXX with the new ID)
  7. click save and view in visual mode
  8. adjust as necessary
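If a workflow has many tasks, step 6 can be done on the copied text before pasting it. This is a rough Python sketch that rewrites every `existing_cluster_id` line in a piece of YAML text with a regular expression. It assumes each ID sits on one line as a plain or quoted scalar; the sample YAML and the IDs are invented, and the helper name `set_cluster_id` is hypothetical.

```python
import re

# Matches "existing_cluster_id: <value>" on a single line, where the value
# is either bare or wrapped in matching single or double quotes.
_CLUSTER_LINE = re.compile(
    r"""^([ \t]*(?:-[ \t]+)?existing_cluster_id:[ \t]*)(["']?)[^"'\s#]+\2""",
    re.MULTILINE,
)


def set_cluster_id(yaml_text: str, new_cluster_id: str) -> str:
    """Replace the value of every existing_cluster_id entry in yaml_text,
    keeping indentation and any quoting as they were."""
    return _CLUSTER_LINE.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{new_cluster_id}{m.group(2)}",
        yaml_text,
    )


# Invented sample, standing in for the text copied in steps 1 and 2.
copied_yaml = """tasks:
  - task_key: extract
    existing_cluster_id: old-cluster-1
  - task_key: load
    existing_cluster_id: "old-cluster-1"
"""

updated_yaml = set_cluster_id(copied_yaml, "new-cluster-9")
print(updated_yaml)
```

This is plain text substitution rather than real YAML parsing, so it is only suitable for a one-shot transfer like the one described here; the result should still be checked in visual mode as in steps 7 and 8.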

Walter_C
Databricks Employee

@itacdonev great option provided. @Dean_Lovelace you can also select View JSON on the workflow and switch to the Create option. With that code you can use the API https://docs.databricks.com/api/workspace/jobs/create to create the job in the desired workspace, reusing the same job code, including the same cluster configuration if you are using a job cluster.
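A call to that endpoint might be sketched as follows, using only the Python standard library. It assumes the `/api/2.1/jobs/create` path and personal access token (Bearer) authentication, so please check both against the linked API page. The host, token, and job settings shown are placeholders, and the function names are made up.

```python
import json
import urllib.request


def build_create_request(host: str, token: str, job_settings: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Jobs create endpoint.
    job_settings is the JSON copied from the Create tab, parsed into a dict."""
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/create",
        data=json.dumps(job_settings).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_job(host: str, token: str, job_settings: dict) -> int:
    """Send the request to the target workspace and return the new job ID."""
    request = build_create_request(host, token, job_settings)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["job_id"]


# Placeholder values; nothing is sent until create_job() is called.
example_request = build_create_request(
    "https://example-workspace.cloud.databricks.com/",
    "dummy-token",
    {"name": "nightly-load"},
)
print(example_request.get_method(), example_request.full_url)
```

Building the request separately from sending it makes it easy to inspect the URL and payload before anything is created in the target workspace.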
