Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Most suitable Data Promotion orchestration for multi-tenant data lake in Databricks

mbanxp
New Contributor III

Hi there!!! I would like to find the most suitable orchestration process to promote data between medallion layers. I need to settle the following key architectural decision for scaling my multi-tenant data lake in Databricks.

My setup:

  • Independent medallion architecture per client (Landing → Bronze → Silver → Gold per client)
  • Identical schema across all clients (same data model)
  • Multiple tables per layer (each with specific transformations)

What would be the best approach in Databricks to orchestrate the data promotion between layers?

  • Independent pipelines per client for all tables
  • Independent pipelines per client and table
  • Independent pipelines per table for all clients

Thanks in advance.

2 REPLIES

sarahbhord
Databricks Employee

Hey mbanxp!

The most scalable and maintainable orchestration pattern for multi-tenant medallion architectures in Databricks is to build independent pipelines per table for all clients, with each pipeline parameterized by client/tenant.

Why this approach?

  • Centralizes business logic for each table (reduces code duplication).
  • Makes onboarding new clients easy: just add configuration, don't duplicate pipeline code.
  • Scales well as data and client count grow.
  • Fits perfectly with Databricks Workflows and Delta Live Tables (DLT), which support parameterized, multi-tenant pipelines and robust orchestration.
  • Unity Catalog provides strong data isolation and governance at the client level, even when sharing pipelines.
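The fan-out this pattern implies (one pipeline per table, parameterized by client) can be sketched in plain Python. Everything here is an illustrative assumption: the `CLIENTS` list, the `PROMOTIONS` config, and the catalog-per-client, schema-per-layer naming are placeholders, not a prescribed layout.

```python
# Illustrative config: which clients exist and which table promotions run.
CLIENTS = ["acme", "globex"]

# One entry per table: (source layer, target layer, table name).
PROMOTIONS = [
    ("bronze", "silver", "orders"),
    ("silver", "gold", "orders_summary"),
]

def qualified_name(client: str, layer: str, table: str) -> str:
    """Build a three-level Unity Catalog name, assuming one catalog per client."""
    return f"{client}.{layer}.{table}"

def promotion_plan():
    """Yield (source, target) table pairs: one logical pipeline per table,
    fanned out across all clients via parameters."""
    for src_layer, tgt_layer, table in PROMOTIONS:
        for client in CLIENTS:
            yield (qualified_name(client, src_layer, table),
                   qualified_name(client, tgt_layer, table))

for src, tgt in promotion_plan():
    print(f"promote {src} -> {tgt}")
```

The key point is that adding a client touches only `CLIENTS` (configuration), never the per-table transformation code.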

Platform Features Enabling This Pattern:

  • Databricks Workflows: Orchestrate parameterized, multi-tenant pipelines.
  • Delta Live Tables (DLT): Declaratively define ETL flows partitioned by client.
  • Unity Catalog: Fine-grained access control and catalog/schema separation per client.
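In Databricks Workflows this typically looks like one job per table whose tasks receive the tenant as a job parameter. A rough sketch of such a job definition (Jobs API style JSON; the job name, notebook path, and parameter names are assumptions for illustration):

```json
{
  "name": "promote_orders_bronze_to_silver",
  "parameters": [
    { "name": "client", "default": "acme" }
  ],
  "tasks": [
    {
      "task_key": "promote_orders",
      "notebook_task": {
        "notebook_path": "/pipelines/promote_orders",
        "base_parameters": {
          "client": "{{job.parameters.client}}"
        }
      }
    }
  ]
}
```

The same job can then be triggered once per client (or wrapped in a for-each style fan-out), keeping a single code path per table.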

Extra tips:
Leverage partitioning and schema separation by client within each layer, and use centralized pipelines to tune job frequencies and resource usage.
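For the schema-separation part of that tip, a Unity Catalog layout with one catalog per client and one schema per layer might look like the following SQL sketch (catalog, schema, and group names are assumptions, not a required convention):

```sql
-- One catalog per client, one schema per medallion layer.
CREATE CATALOG IF NOT EXISTS acme;
CREATE SCHEMA IF NOT EXISTS acme.silver;

-- Restrict each client's group to its own catalog and schemas.
GRANT USE CATALOG ON CATALOG acme TO `acme_analysts`;
GRANT USE SCHEMA ON SCHEMA acme.silver TO `acme_analysts`;
GRANT SELECT ON SCHEMA acme.silver TO `acme_analysts`;
```

This keeps per-client isolation at the governance layer while the promotion pipelines themselves stay shared and parameterized.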

Summary:
Organizing by per-table, multi-tenant pipelines is Databricks' best practice for efficient, standardized, and easily governed medallion data flows at scale.

I hope this helps.

Best,

Sarah

mbanxp
New Contributor III

Hi sarahbhord !!!


Thanks very much for the useful reply; it really helps me understand the best approach to follow. In my case I have roughly the following architecture:

[attached architecture diagram: mbanxp_0-1760087606665.png]

Based on the approach of independent pipelines per table for all clients, what would be your recommendation?