Data Engineering
Run multiple jobs with different source code at the same time with Databricks asset bundles

curiousoctopus
New Contributor II

Hi,

I am migrating from dbx to Databricks Asset Bundles. Previously, with dbx, I could work on different features in separate branches and launch jobs without one job overwriting another. Now, with asset bundles, it seems I can't, since deploying updates ONE job and runs an instance of the latest deployment.

This is what I have in my `databricks.yml` to deploy my job:

resources:
  jobs:
    <my-job>:
      name: my-job-${var.suffix}
      tasks:
        - ...

I thought I could use a custom variable (here, suffix) to create multiple jobs, with the feature name as a suffix for example, so that everyone working on different features could run their experiments. However, it just renamed the previously deployed job. I also tried using the custom variable within the key <my-job>, but that wasn't allowed.
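For context, this is roughly how that variable would be wired up end to end — a minimal, hypothetical sketch, assuming a `suffix` variable declared in `databricks.yml` and the CLI's `--var` flag (verify the flag against your CLI version):

```yaml
# Hypothetical sketch of the setup described above. Because the resource key
# (<my-job>) is fixed, redeploying the bundle with a different suffix renames
# the one deployed job rather than creating a second one.
variables:
  suffix:
    description: Feature-branch name appended to the job name
    default: dev

resources:
  jobs:
    <my-job>:
      name: my-job-${var.suffix}
      tasks:
        - ...

# Deployed with, e.g.:
#   databricks bundle deploy --var="suffix=feature-x"
```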

 

So my question is how can I achieve this? Ultimately I want to be able to work on a different feature than my colleagues and not have to coordinate when I can launch my job to not overwrite theirs.

 

Thank you.

2 REPLIES

Kaniz
Community Manager

Hi @curiousoctopus, migrating from dbx to Databricks Asset Bundles (DAB) is a significant transition, and I understand your concern about managing multiple jobs for different features.

Let’s explore how you can achieve your goal of working on separate features without overwriting your colleagues’ work.

  1. Understanding Databricks Asset Bundles (DAB):

    • Asset bundles describe Databricks resources (jobs, pipelines, and so on) declaratively in YAML and deploy them through the Databricks CLI.

  2. Job Configuration in DAB:

    • In DAB, you define job configurations in a databricks.yml file.
    • The name field in your job configuration specifies the name of the job.
    • However, using a custom variable (like ${var.suffix}) in the job name won’t create separate jobs; it will only change the name of the existing job.
  3. Creating Multiple Jobs with Unique Names:

    • To achieve your goal, you need to create separate jobs for different features.

    • Instead of using a custom variable within the job name, consider creating a separate job configuration for each feature.

    • For example, you could define multiple job configurations like this:

      resources:
        jobs:
          feature1-job:
            name: my-feature1-job
            tasks:
              - ...
          feature2-job:
            name: my-feature2-job
            tasks:
              - ...
          # Add more job configurations for other features
      
  4. Using Job Parameters:

    • If you want to parameterize your job configurations further, consider using job parameters.

    • Define a job parameter (e.g., feature_name) and use it within your tasks.

    • For example:

      resources:
        jobs:
          feature1-job:
            name: my-feature1-job-${var.suffix}
            tasks:
              - task_key: main
                notebook_task:
                  notebook_path: /path/to/feature1-notebook
                  base_parameters:
                    feature_name: feature1
          feature2-job:
            name: my-feature2-job-${var.suffix}
            tasks:
              - task_key: main
                notebook_task:
                  notebook_path: /path/to/feature2-notebook
                  base_parameters:
                    feature_name: feature2
          # Add more job configurations for other features
      
  5. Deploying and Running Jobs:

    • Once you’ve defined separate job configurations, deploy them using DAB.
    • Each job will have a unique name based on the feature it represents.
    • Running these jobs won’t overwrite each other, allowing you to work independently.
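The deploy/run cycle above would look roughly like this — a sketch assuming the standard `databricks bundle` CLI commands (flags and output may differ by CLI version):

```shell
# Deploy every job defined in databricks.yml to the selected target
databricks bundle deploy

# Run one feature's job by its resource key
# (feature1-job is the hypothetical key from the example above)
databricks bundle run feature1-job
```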

Remember that DAB provides a sustainable way to manage jobs on Databricks, and it’s the recommended approach going forward. Happy coding, and may your features flourish like well-nurtured data gardens! 🌱🚀

Hi Kaniz,

Thank you for your answer and the time taken. Unfortunately, this is not an acceptable solution for me, as for every feature we develop we would have to create a new job within the `databricks.yml` file. This is too much of a hassle and ultimately defeats the purpose of CI/CD pipelines.

dbx uses an asset-based approach to allow testing new features without overwriting the current job definition. The use cases mentioned are exactly what we are looking for in dab (also see their documentation):

  • You want to update or change job definitions only when you release the job
  • Multiple users working in parallel on the same job (e.g. in CI pipelines)

Does dab offer a similar feature? And if not, is it planned? As this is a considerable issue for my team, we are considering not switching to dab and keeping dbx instead.
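For readers with the same question: one direction worth checking in the asset bundle documentation is development-mode targets, which are aimed at exactly this parallel-development case — in development mode, each user's deployment is isolated and deployed resource names get a per-user prefix. A minimal, hypothetical sketch (the `dev` target name is a placeholder; verify the behavior against the current docs):

```yaml
# Hypothetical sketch: a bundle target with mode: development. Deploying to
# this target gives each user an isolated copy of the resources, with names
# prefixed per user, e.g. "[dev someone] my-job", so deployments don't collide.
targets:
  dev:
    mode: development
    default: true
```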

Thank you.
