topic Re: Most suitable Data Promotion orchestration for multi-tenant data lake in Databricks in Data Engineering

Most suitable Data Promotion orchestration for multi-tenant data lake in Databricks

mbanxp — Tue, 07 Oct 2025 14:36:52 GMT

Hi there! I would like to find the most suitable orchestration process for promoting data between medallion layers. This is a key architectural decision I need to make in order to scale my multi-tenant data lake in Databricks.

My setup:

Independent medallion architecture per client (Landing → Bronze → Silver → Gold per client)
Identical schema across all clients (same data model)
Multiple tables per layer (each with specific transformations)

Which of the following would be the best approach in Databricks to orchestrate data promotion between layers?

Independent pipelines per client for all tables
Independent pipelines per client and table
Independent pipelines per table for all clients

Thanks in advance.

Re: Most suitable Data Promotion orchestration for multi-tenant data lake in Databricks

sarahbhord — Wed, 08 Oct 2025 13:09:08 GMT

Hey mbanxp!

The most scalable and maintainable orchestration pattern for multi-tenant medallion architectures in Databricks is to build independent pipelines per table for all clients, with each pipeline parameterized by client/tenant.

Why this approach?

Centralizes business logic for each table (reduces code duplication).
Makes onboarding new clients easy—just add configuration, don't duplicate pipeline code.
Scales well as data and client count grow.
Fits perfectly with Databricks Workflows and Delta Live Tables (DLT), which support parameterized, multi-tenant pipelines and robust orchestration.
Unity Catalog provides strong data isolation and governance at the client level, even when sharing pipelines.
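
To make the "one pipeline per table, parameterized by client" idea concrete, here is a minimal plain-Python sketch of how a single table's pipeline could enumerate its promotion steps for every client. The naming convention (one catalog per client, one schema per layer, landing treated as a schema for simplicity) and the names `PromotionStep`, `qualified_name`, and `plan_table_pipeline` are assumptions for illustration only; they are not Databricks APIs, and the client names are made up.

```python
from dataclasses import dataclass

# Layer order taken from the setup described in the question.
LAYERS = ["landing", "bronze", "silver", "gold"]


@dataclass(frozen=True)
class PromotionStep:
    client: str
    table: str
    source: str
    target: str


def qualified_name(client: str, layer: str, table: str) -> str:
    # Assumed convention: <client catalog>.<layer schema>.<table>.
    return f"{client}.{layer}.{table}"


def plan_table_pipeline(table: str, clients: list[str]) -> list[PromotionStep]:
    """List every layer-to-layer hop for one table across all clients."""
    steps = []
    for client in clients:
        # Pair each layer with the next one: landing->bronze, bronze->silver, ...
        for src_layer, dst_layer in zip(LAYERS, LAYERS[1:]):
            steps.append(
                PromotionStep(
                    client=client,
                    table=table,
                    source=qualified_name(client, src_layer, table),
                    target=qualified_name(client, dst_layer, table),
                )
            )
    return steps


# Hypothetical clients; onboarding a new one only means extending this list.
for step in plan_table_pipeline("orders", ["acme", "globex"]):
    print(f"{step.source} -> {step.target}")
```

The transformation logic for `orders` would live in one place, and the client list is the only thing that changes when a tenant is added.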

Platform Features Enabling This Pattern:

Databricks Workflows: Orchestrate parameterized, multi-tenant pipelines.
Delta Live Tables (DLT): Declaratively define ETL flows partitioned by client.
Unity Catalog: Fine-grained access control and catalog/schema separation per client.

Extra tips:
Leverage partitioning and schema separation by client within each layer, and use centralized pipelines to tune job frequencies and resource usage.
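
As a rough illustration of tuning job frequency from a central place, the sketch below groups clients by table and refresh frequency, so each per-table pipeline can be triggered once per frequency with the matching client list. The `TENANTS` registry, its contents, and the function name are hypothetical; in practice this configuration might live in a config table or file.

```python
from collections import defaultdict

# Hypothetical tenant registry: client -> table -> desired refresh frequency.
TENANTS = {
    "acme": {"orders": "hourly", "customers": "daily"},
    "globex": {"orders": "daily", "customers": "daily"},
}


def clients_by_table_and_frequency(tenants: dict) -> dict:
    """Group clients so each (table, frequency) pair maps to one pipeline run."""
    grouped = defaultdict(list)
    # Sort clients so the resulting lists are deterministic.
    for client in sorted(tenants):
        for table, frequency in tenants[client].items():
            grouped[(table, frequency)].append(client)
    return dict(grouped)


schedule = clients_by_table_and_frequency(TENANTS)
for (table, frequency), clients in sorted(schedule.items()):
    print(f"{table} [{frequency}]: {clients}")
```

Each `(table, frequency)` entry could then become one scheduled run of that table's pipeline, with the client list passed in as a parameter.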

Summary:
Organizing pipelines per table, shared across tenants and parameterized by client, is Databricks’ best practice for efficient, standardized, and easily governed medallion data flows at scale.

I hope this helps.

Best,

Sarah

Re: Most suitable Data Promotion orchestration for multi-tenant data lake in Databricks

mbanxp — Fri, 10 Oct 2025 09:25:49 GMT

Hi sarahbhord !!!

Thanks very much for the useful reply; it really helps me understand the best approach to follow. In my case I have roughly the following architecture:

Based on the approach of independent pipelines per table for all clients, what would be your recommendation?