Asset Bundles: Shared libraries and notebooks in monorepo multi-bundle setup
05-15-2025 07:53 AM
I am part of a small team of Data Engineers that started using Databricks Asset Bundles one year ago. Our code base consists of typical ETL workloads, written primarily as Jupyter notebooks (.ipynb) and job definitions (.yaml), and it spans a large number of different business domains.
Currently, our code base consists of a single monorepo with one large bundle containing all our notebooks, jobs, libraries etc.
Our code base has grown to a size where we see the need to split our single bundle into several smaller bundles - one for each business domain.
We are envisioning a setup similar to the following (simplified) structure:
monorepo/
│
├── shared_notebooks/
├── shared_libraries/
├── variables.yml
│
├── Bundle_A/
│   ├── resources/
│   ├── src/
│   └── databricks.yml
│
└── Bundle_B/
    ├── resources/
    ├── src/
    └── databricks.yml
where the repo contains some shared notebooks and libraries which may be used in all bundles in our repository.
Does anyone have some suggestions for how this should be implemented?
- How can we "import" shared assets (notebooks, libraries and variables) into our bundles?
- Does our approach to splitting up our mono-bundle repository seem sensible?
Thanks in advance for your insights!
Kaspar Hauser