Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Asset Bundles Overriding Existing Jobs (despite different name_prefix)

ChristianRRL
Honored Contributor

Hi there, I'm seeing what seems to be unexpected behavior in Databricks Asset Bundle deployment, and I'm hoping I can get clarification on this.

Basically, what I'm trying to do is deploy the same asset bundle twice (two different variations), with different names, without the deployment of one bundle destroying the other. However, when I change my asset bundle YAML files in ways I would expect to accomplish this, only one set of bundle jobs persists.

For example, if I deploy one bundle in development mode with bundle name `dag` and default name prefix `[dev crodr]`, it creates that set of bundle jobs. But when I tweak the config to use bundle name `dag_config_folder` and a different name prefix `[dev crodr config folder]`, rather than generating a separate set of bundle jobs, it overwrites the prior set of bundle jobs.
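Roughly, the relevant pieces of my config look like the following (a simplified sketch; the surrounding settings are omitted and the target name `dev` is illustrative):

```yaml
# Variation A of databricks.yml (sketch)
bundle:
  name: dag

targets:
  dev:
    mode: development
    presets:
      name_prefix: "[dev crodr] "

# Variation B: the same file edited in place before redeploying
# bundle:
#   name: dag_config_folder
# targets:
#   dev:
#     mode: development
#     presets:
#       name_prefix: "[dev crodr config folder] "
```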

Please let me know if I may be overlooking something here!

[screenshots attached]

4 REPLIES

pradeep_singh
Contributor

Can you check whether both of your variations are using a different root_path? If not, that explains the problem.

Override root_path to make the identities unique.

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev

Hi there @pradeep_singh, this is the part I'm finding confusing. I do have different root paths due to the bundle name being different. Please see below:

root_path: /Workspace/Users/${workspace.current_user.userName}/.bundle/${bundle.name}/src

 

With asset bundle deployment (A), my bundle name is simply `dag`, and so the root_path below would look like this:

root_path: /Workspace/Users/${workspace.current_user.userName}/.bundle/dag/src

 

With asset bundle deployment (B), my bundle name is `dag_config_folder`, and so the root_path below would look like this:

root_path: /Workspace/Users/${workspace.current_user.userName}/.bundle/dag_config_folder/src

 

And upon deployment, I do in fact see that the source code is deployed to the correct new path `dag_config_folder`; however, my prior asset bundle jobs are overwritten despite the different information I specified (different bundle name and different name_prefix). So I'm trying to understand whether I'm missing something that would keep the existing jobs from being deleted or overwritten when I deploy an asset bundle with a new name_prefix.

pradeep_singh
Contributor

Can you try specifying a different name for the target as well?

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev

SteveOstrowski
Databricks Employee

Hi @ChristianRRL,

This behavior comes down to how Databricks Asset Bundles track deployed resources using Terraform state, and specifically where that state is stored locally.

HOW BUNDLE STATE TRACKING WORKS

When you run "databricks bundle deploy", the CLI uses Terraform under the hood to manage workspace resources. It maintains state in two places:

1. Remotely, at the workspace.state_path (which defaults to a subfolder under your workspace.root_path)
2. Locally, in the .bundle/<target-name>/terraform.tfstate file inside your project directory

The local state file is what maps each resource key (e.g., resources.jobs.my_job_key) to a specific workspace job ID. On subsequent deployments, Terraform reads this local state to determine which existing workspace resources to update rather than creating new ones.
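You can see this mapping yourself. A rough sketch, assuming the default layout described above and an illustrative target name `dev`:

```shell
# One directory per deployed target
ls .bundle/

# The local state file that maps resource keys to workspace job IDs
cat .bundle/dev/terraform.tfstate
```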

WHY YOUR SECOND DEPLOYMENT OVERRIDES THE FIRST

When you change the bundle name from "dag" to "dag_config_folder" within the same local project directory and redeploy, here is what happens:

1. The remote root_path changes (because it includes ${bundle.name}), so the remote state is stored at a new location
2. However, the local .bundle/<target>/terraform.tfstate file still exists and still contains the job IDs from your first deployment
3. Terraform sees those existing job IDs in the local state, matches them to the same resource keys in your config, and updates those jobs in place rather than creating new ones
4. The old remote state directory (under .bundle/dag/) is effectively orphaned

The name_prefix only affects the display name of the job in the workspace. It does not change the resource identity as far as Terraform state tracking is concerned. The resource key in your YAML (the key under resources.jobs) is what Terraform uses to map config to state.
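In other words, in a config like the following (a simplified sketch with hypothetical names, not your actual file), the key `my_job_key` is the identity Terraform tracks, while name_prefix only decorates the display name:

```yaml
resources:
  jobs:
    my_job_key:          # resource key: this is what Terraform state maps to a job ID
      name: ingest_job   # display name: presets.name_prefix is prepended to this
```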

HOW TO FIX THIS

You have a few options depending on your use case:

OPTION 1: Use separate targets in one bundle config

Define both variations as separate targets in a single databricks.yml. Each target gets its own independent state:

targets:
  dag_dev:
    mode: development
    default: true
    presets:
      name_prefix: "[dev crodr] "
    workspace:
      root_path: /Workspace/Users/${workspace.current_user.userName}/.bundle/${bundle.name}/dag/src

  dag_config_folder_dev:
    mode: development
    presets:
      name_prefix: "[dev crodr config folder] "
    workspace:
      root_path: /Workspace/Users/${workspace.current_user.userName}/.bundle/${bundle.name}/dag_config_folder/src
Then deploy each one independently:

databricks bundle deploy -t dag_dev
databricks bundle deploy -t dag_config_folder_dev

Each target has its own local state at .bundle/dag_dev/ and .bundle/dag_config_folder_dev/, so they will not interfere with each other.

OPTION 2: Use completely separate project directories

If these are truly independent bundles, keep them in separate directories, each with their own databricks.yml. This gives each bundle its own .bundle/ state directory, so deployments are fully isolated.

OPTION 3: Clear local state before switching

If you must change the bundle name in the same project, delete the local .bundle/ directory before deploying the new variation. This forces Terraform to treat everything as new resources. Be aware that this orphans the old jobs in your workspace, so you would need to manually clean those up or run "databricks bundle destroy" before making the change.
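As a rough sketch of that sequence (the target name `dev` is illustrative; run the destroy step before editing the bundle name):

```shell
# Tear down the jobs tracked by the current local state first, to avoid orphans
databricks bundle destroy -t dev

# Remove the local Terraform state so the next deploy creates fresh resources
rm -rf .bundle

# After changing bundle.name / name_prefix, redeploy
databricks bundle deploy -t dev
```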

RECOMMENDED APPROACH

Option 1 (separate targets) is generally the cleanest approach. It keeps everything in one project, gives you isolated state per variation, and lets you deploy and destroy each variation independently.

DOCUMENTATION REFERENCES

- Databricks Asset Bundles overview: https://docs.databricks.com/en/dev-tools/bundles/index.html
- Deployment modes and presets: https://docs.databricks.com/en/dev-tools/bundles/deployment-modes.html
- Bundle configuration settings: https://docs.databricks.com/en/dev-tools/bundles/settings.html

* This reply was drafted with an agent system I built, which researches responses based on the wide set of documentation I have available and previous memory. I personally review each draft for obvious issues, monitor the system's reliability, and update responses when I detect drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand-new features.