Data Engineering

Run multiple jobs with different source code at the same time with Databricks asset bundles

curiousoctopus
New Contributor III

Hi,

I am migrating from dbx to Databricks asset bundles. Previously, with dbx, I could work on different features in separate branches and launch jobs without one job overwriting the other. Now, with Databricks asset bundles, it seems like I can't, since it deploys/updates ONE job and runs an instance of the latest deployment.

This is what I have in my `databricks.yml` to deploy my job:

resources:
  jobs:
    <my-job>:
      name: my-job-${var.suffix}
      tasks:
        - ...

I thought I could use a custom variable (here `suffix`) to create multiple jobs, for example with the feature name as a suffix, so that everyone working on a different feature could run their own experiments. However, it just renamed the previously deployed job. I also tried using the custom variable within the key `<my-job>`, but that isn't allowed.
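For reference, this is roughly how I declared and passed the variable used in the job name above (the names are placeholders and I am quoting the `--var` syntax from memory, so treat it as a sketch rather than my exact setup):

variables:
  suffix:
    description: Feature name appended to the job name
    default: dev

# deployed from a feature branch with something like:
#   databricks bundle deploy --var="suffix=my-feature"

Since the resource key `<my-job>` and the bundle's deployment state stay the same, this only renames the one existing job instead of creating a second one.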

 

So my question is: how can I achieve this? Ultimately, I want to be able to work on a different feature than my colleagues without having to coordinate when I can launch my job so that it doesn't overwrite theirs.

 

Thank you.

4 REPLIES

Hi Kaniz,

Thank you for your answer and the time taken. Unfortunately, this is not an acceptable solution for me, as for every feature we develop we would have to create a new job within the `databricks.yml` file. That is too much of a hassle and ultimately defeats the purpose of a CI/CD pipeline.

dbx uses an asset-based approach that allows testing new features without overwriting the current job definition. The use cases mentioned are exactly what we are looking for in dab (also see their documentation):

  • You want to update or change job definitions only when you release the job
  • Multiple users working in parallel on the same job (e.g. in CI pipelines)

Does dab offer a similar feature? And if not, is it planned? As this is a considerable issue for my team, we are considering not switching to dab and keeping dbx instead.
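From what I can tell from the asset bundle documentation, the closest equivalent seems to be a development-mode target, which (if I understand it correctly) prefixes every deployed resource with the deploying user's name and keeps the deployment under that user's own workspace path. A rough sketch, with the host as a placeholder:

targets:
  dev:
    mode: development
    default: true
    workspace:
      # placeholder URL for the target workspace
      host: https://<your-workspace>.cloud.databricks.com

That would at least give each developer their own copy of the job, but as far as I can tell it still would not let a single user keep several parallel deployments (e.g. one per feature branch) the way the dbx asset-based approach does.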

Thank you.

dattomo1893
New Contributor II

Any updates here?

My team is migrating from dbx to DABs and we are running into the same issue. Ideally, we would like to deploy multiple parameterized jobs from a single bundle. If this is not possible, we will have to keep dbx.

Thank you!

Starki
New Contributor III

We have exactly the same issue here.

@Retired_mod, any information on whether DABs will ever support dbx-style assets-only deployments?
Alternatively, deployments with parameterized job names that do not overwrite the existing job, as the OP described?

Thanks!

mo_moattar
New Contributor III

We have the same issue. We might have multiple open PRs on the bundles that deploy code, pipelines, jobs, etc. to the same workspace before the merge, and they keep overwriting each other in the workspace.

The jobs already have a separate ID (a backend ID) assigned to them, so I don't know why the YAML key has been used as the unique identifier of the job in the bundle.
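One idea (which I have not fully verified) would be to make the deployment path itself branch-specific, since the bundle keeps its deployment state under `workspace.root_path`: a separate root path per branch or PR should give a separate deployment that does not touch anyone else's jobs. Roughly, with `branch` as a variable passed in from CI and all names as placeholders:

variables:
  branch:
    description: Git branch or PR identifier passed in from CI
    default: main

workspace:
  # one deployment (and one set of jobs) per branch
  root_path: ~/.bundle/${bundle.name}/${bundle.target}/${var.branch}

resources:
  jobs:
    my_job:
      # suffix the job name too, so the copies are distinguishable in the UI
      name: my-job-${var.branch}

# in CI, something like:
#   databricks bundle deploy -t dev --var="branch=${GITHUB_HEAD_REF}"

The obvious downside is that CI would also need to clean up these per-branch deployments once a PR is merged.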
