Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Asset Bundles: Shared libraries and notebooks in monorepo multi-bundle setup

auso
New Contributor

I am part of a small team of data engineers that started using Databricks Asset Bundles one year ago. Our code base consists of typical ETL workloads, written primarily as Jupyter notebooks (.ipynb) and job definitions (.yaml), spanning a large number of different business domains.

Currently, everything lives in a single monorepo with one large bundle containing all our notebooks, jobs, libraries, etc.

Our code base has grown to a size where we see the need to split our single bundle into several smaller bundles - one for each business domain.

We are envisioning a setup similar to the following (simplified) structure:

monorepo/
│
├── shared_notebooks/
├── shared_libraries/
├── variables.yml
│
├── Bundle_A/
│   ├── resources/
│   ├── src/
│   └── databricks.yml
│
└── Bundle_B/
    ├── resources/
    ├── src/
    └── databricks.yml

Here the repo root contains shared notebooks and libraries that may be used by all bundles in the repository.
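To make the question concrete, here is a rough sketch of what we imagine Bundle_A/databricks.yml might look like, using relative sync paths to pull the shared folders into the bundle's deployment. The folder names come from the tree above; the host is a placeholder, and we are not sure `sync.paths` is the intended mechanism for this:

```yaml
# Bundle_A/databricks.yml -- rough sketch, not a tested configuration
bundle:
  name: bundle_a

# Pull the shared folders from the repo root into this bundle's deployment.
# As far as we can tell, sync.paths entries may use relative paths that
# point outside the bundle root.
sync:
  paths:
    - ../shared_notebooks
    - ../shared_libraries
    - ./src

# Job and pipeline definitions for this business domain.
include:
  - resources/*.yml

targets:
  dev:
    mode: development
    workspace:
      host: https://example.cloud.databricks.com  # placeholder host
```

It is unclear to us whether shared variables (variables.yml at the repo root) can be imported the same way, since `include` paths appear to be resolved relative to databricks.yml.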

Does anyone have suggestions for how this could be implemented?

  1. How can we "import" shared assets (notebooks, libraries and variables) into our bundles?
  2. Does our approach to splitting up our mono-bundle repository seem sensible?

Thanks in advance for your insights!

Kaspar Hauser

