Administration & Architecture

Updating projects created from Databricks Asset Bundles

Sleiny
New Contributor

Hi all

We are using Databricks Asset Bundles for our data science / ML projects. The asset bundle we have has spawned quite a few projects by now, but we now need to make some updates to the asset bundle, and those updates should also be applied to the spawned projects.

So my question is: how do we handle this? Is there a feature integrated with Databricks Asset Bundles, or would we need to look in a different direction?

I know there are some tools compatible with cookiecutter templates, where you can update the cookiecutter template and then apply the changes to the spawned projects. However, I can't seem to find anything that would make sense from a Databricks Asset Bundles perspective. I think it is quite an issue, honestly.

Kind regards

Niels 

1 REPLY

Louis_Frolio
Databricks Employee

Greetings @Sleiny,

Here’s what’s really going on, plus a pragmatic, field-tested plan you can actually execute without tearing up your repo strategy.

Let’s dig in.

What’s happening

Databricks Asset Bundles templates are used at initialization time via databricks bundle init—either from default templates or from your own custom ones. They’re great for standardizing how projects start. The key detail is that templates are explicitly positioned as one-time scaffolding. The docs cover how to create them, share them, and initialize bundles from them—but there is no built-in mechanism to “re-apply” template changes back onto projects that were already spawned.

Once a bundle exists, it’s just configuration in source control—typically YAML (or Python), with databricks.yml at the root. You can compose that configuration using include, along with other top-level constructs like git metadata, scripts, and sync behavior. This makes bundles modular, but again, that modularity is your responsibility to design.
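
For illustration, a stripped-down databricks.yml along those lines might look roughly like this; the project name, include glob, and workspace hosts are placeholders rather than anything specific to your setup:

    # databricks.yml at the repo root composes the rest of the bundle
    bundle:
      name: my_ml_project
      git:
        # optional git metadata recorded alongside deployments
        origin_url: https://github.com/your-org/my-ml-project

    include:
      - resources/*.yml         # per-project job and pipeline definitions

    sync:
      exclude:
        - notebooks/scratch/**  # local-only files that should never be synced

    targets:
      dev:
        mode: development
        workspace:
          host: https://adb-1111111111111111.1.azuredatabricks.net
      prod:
        mode: production
        workspace:
          host: https://adb-2222222222222222.2.azuredatabricks.net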

For shared logic across many projects, the right abstraction is not copy-paste—it’s centrally versioned libraries. Wheels, JARs, and PyPI packages can be referenced directly in bundle job tasks so that “common code” lives in one place instead of being scattered across a dozen repos.

Bundles also give you workflows like bundle generate and bundle deployment bind, but those are about keeping a single project’s local configuration aligned with what’s already deployed. They are not designed to propagate template evolution across multiple projects.

Implications

There is no native “cookiecutter update” equivalent for Databricks Asset Bundles. Template updates do not automatically fan out to existing projects. That said, bundles are git-first and composable, which means you can implement clean, scalable patterns that solve this problem in a very software-engineering-native way.

Recommended action plan

  1. Separate shared code from per-project code using central libraries

    Move shared DS/ML logic—training loops, utilities, feature engineering, common jobs—into one or more versioned packages. Publish those as wheels or JARs into Unity Catalog volumes or workspace files. Then each bundle simply references the shared artifact. Updating behavior becomes a version bump plus a redeploy, not a repo-wide refactor.

Once you do this, drift essentially disappears at the code layer.
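
As a rough sketch of what that reference can look like inside a bundle's job resources, assuming the shared wheel has been published to a Unity Catalog volume (the volume path, package name, version, and cluster settings below are hypothetical):

    resources:
      jobs:
        train_model:
          name: train-model
          tasks:
            - task_key: train
              notebook_task:
                # project-specific notebook that imports the shared package
                notebook_path: ./notebooks/train_model.py
              libraries:
                # centrally versioned shared DS/ML code; updating common
                # behavior is a version bump here plus a redeploy
                - whl: /Volumes/main/shared/libs/org_ml_common-1.4.0-py3-none-any.whl
              new_cluster:
                spark_version: 15.4.x-scala2.12
                node_type_id: Standard_DS3_v2   # cloud-specific, adjust as needed
                num_workers: 1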

  2. Externalize shared bundle configuration and include it

    Create a small “org-bundle-base” repo that holds your standard YAML fragments: compute presets, job conventions, cluster policies, security defaults, tagging, all of it. In each project, use include in databricks.yml to pull those fragments in. Manage that shared repo via a Git submodule or a pinned vendor path.

Now you have one place to edit common configuration—and updating a project becomes a simple submodule bump, validate, and deploy.
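
A minimal sketch of that wiring in each project's databricks.yml, assuming the shared repo is vendored as a submodule in an org-bundle-base/ folder (the fragment file names are invented for illustration):

    include:
      # project-specific resources
      - resources/*.yml
      # shared org-wide fragments pulled in from the git submodule;
      # a submodule bump is all it takes to pick up new defaults
      - org-bundle-base/compute_presets.yml
      - org-bundle-base/job_conventions.yml
      - org-bundle-base/tags_and_security.yml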

  3. If your projects came from a template repo, use an upstream merge model

    If your bundles were initialized from a Git-hosted template, treat those projects like forks. Add the template repo as an upstream remote and periodically merge or rebase in changes. The docs allow templates to come from Git URLs, but they don’t establish a persistent linkage—this upstream model is how you operationalize template evolution in practice.

  4. Automate propagation when scale kicks in

    If you’re managing dozens of repos, you don’t want humans doing this manually. Script it.

Automate submodule bumps, shared package version updates, and config edits via CI (GitHub Actions, Azure DevOps, etc.). Have those workflows open PRs across repos. Gate everything with databricks bundle validate and lightweight smoke runs in dev before anything hits prod. This keeps your fleet consistent without centralizing everything into one monorepo.
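
One possible shape for that automation, sketched under a few assumptions: GitHub Actions as the CI system, a dev target named dev, workspace credentials stored as repository secrets, and the peter-evans/create-pull-request action for opening the PR. Treat it as a starting point rather than a reference implementation:

    name: bump-shared-bundle-base

    on:
      workflow_dispatch:        # run manually or from a fleet-wide trigger
      schedule:
        - cron: "0 6 * * 1"     # weekly check for upstream changes

    jobs:
      propagate:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
            with:
              submodules: recursive

          - name: Bump the shared org-bundle-base submodule
            run: git submodule update --remote org-bundle-base

          - uses: databricks/setup-cli@main

          - name: Validate the bundle against dev before anything ships
            env:
              DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
              DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
            run: databricks bundle validate -t dev

          - name: Open a PR with the proposed update
            uses: peter-evans/create-pull-request@v6
            with:
              commit-message: "chore: bump shared bundle base"
              title: "chore: bump shared bundle base"
              branch: chore/bump-bundle-base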

  5. Use generate/bind for drift control—not for template sync

    When you’re updating definitions for jobs and pipelines that already exist in a workspace, bundle generate and bundle deployment bind are your safety rails. They help keep each project’s local state aligned with deployed reality while you’re rolling out broader repo-level changes. Think of this as drift control, not template propagation.

Suggested next steps

First, inventory your changes and cleanly separate shared code from shared configuration.

Second, stand up a central shared-library pipeline and migrate projects to consume versioned artifacts.

Third, create your org-level base bundle repo and wire it in with include.

Finally, automate the update flow so this becomes routine instead of a quarterly fire drill.

Bottom line

There is no built-in Databricks Asset Bundles feature that automatically re-applies template changes to existing projects. The right solution is git-native: shared base repos via include, upstream merges for template-derived projects, and centrally versioned wheels and JARs for shared logic. Once you adopt those patterns, rolling out updates becomes predictable, low-touch, and safe.

Hope this helps, Louis.