Databricks Community

mbanxp · ‎10-07-2025

Hi there! I would like to find the most suitable orchestration process for promoting data between medallion layers. This is a key architectural decision I need to settle in order to scale my multi-tenant data lake in Databricks.

My setup:

Independent medallion architecture per client (Landing → Bronze → Silver → Gold per client)
Identical schema across all clients (same data model)
Multiple tables per layer (each with specific transformations)

What would be the best approach in Databricks to orchestrate the data promotion between layers?

Independent pipelines per client for all tables
Independent pipelines per client and table
Independent pipelines per table for all clients

Thanks in advance.

sarahbhord · ‎10-08-2025

Hey mbanxp!

The most scalable and maintainable orchestration pattern for multi-tenant medallion architectures in Databricks is to build independent pipelines per table for all clients, with each pipeline parameterized by client/tenant.
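
To make "one pipeline per table, parameterized by client" concrete, here is a minimal sketch in plain Python. It assumes one catalog per client and one schema per layer, which is a naming convention chosen for illustration and not something stated in this thread. The table names, column names, and the helpers `qualified_name` and `build_silver_sql` are all hypothetical.

```python
import re

# Assumed convention (not from the thread): one catalog per client,
# one schema per layer, identical table names everywhere.
LAYERS = ("landing", "bronze", "silver", "gold")

# One shared SELECT template per table; {source} is the only client-specific part.
# The column names are made up for illustration.
SILVER_TRANSFORMS = {
    "orders": "SELECT order_id, customer_id, amount FROM {source} WHERE order_id IS NOT NULL",
    "customers": "SELECT DISTINCT customer_id, name FROM {source}",
}


def qualified_name(client, layer, table):
    """Build a three-level name such as 'client_a.silver.orders'."""
    for part in (client, layer, table):
        # Reject anything that is not a plain identifier before it reaches SQL.
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", part):
            raise ValueError(f"invalid identifier: {part!r}")
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    return f"{client}.{layer}.{table}"


def build_silver_sql(table, client):
    """Return the bronze-to-silver statement for one table and one client."""
    source = qualified_name(client, "bronze", table)
    target = qualified_name(client, "silver", table)
    return f"INSERT INTO {target} {SILVER_TRANSFORMS[table].format(source=source)}"


if __name__ == "__main__":
    # The same table logic runs for every client; only the parameter changes.
    for client in ("client_a", "client_b"):
        print(build_silver_sql("orders", client))
```

In a notebook, the client value would typically arrive as a job parameter (for example through `dbutils.widgets.get("client")`) and the resulting statement would be passed to `spark.sql`. The point of the sketch is only that the transformation for a table lives in one place.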

Why this approach?

Centralizes business logic for each table (reduces code duplication).
Makes onboarding new clients easy: just add configuration instead of duplicating pipeline code.
Scales well as data and client count grow.
Fits perfectly with Databricks Workflows and Delta Live Tables (DLT), which support parameterized, multi-tenant pipelines and robust orchestration.
Unity Catalog provides strong data isolation and governance at the client level, even when sharing pipelines.
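
As a rough illustration of "onboarding is just configuration", the sketch below generates one job definition per table from a list of clients. The field names (`name`, `tasks`, `task_key`, `depends_on`, `notebook_task`, `notebook_path`, `base_parameters`) follow the Databricks Jobs API job settings. The client identifiers, the notebook paths, and the helper `build_job` are hypothetical, and compute settings are left out, so this is not a complete payload.

```python
# Hypothetical tenant identifiers; onboarding a client means adding one entry here.
CLIENTS = ["client_a", "client_b"]

# Layer-to-layer hops, taken from the Landing -> Bronze -> Silver -> Gold flow.
HOPS = [("landing", "bronze"), ("bronze", "silver"), ("silver", "gold")]


def build_job(table, clients):
    """Build job settings for one table: one task chain per client."""
    tasks = []
    for client in clients:
        previous_key = None
        for source, target in HOPS:
            key = f"{client}__{source}_to_{target}"
            task = {
                "task_key": key,
                "notebook_task": {
                    # Hypothetical path: one shared notebook per table and hop.
                    "notebook_path": f"/Pipelines/{table}/{source}_to_{target}",
                    "base_parameters": {"client": client, "table": table},
                },
            }
            if previous_key is not None:
                # Each hop waits for the previous hop of the same client only,
                # so clients progress independently of one another.
                task["depends_on"] = [{"task_key": previous_key}]
            tasks.append(task)
            previous_key = key
    return {"name": f"promote_{table}", "tasks": tasks}


if __name__ == "__main__":
    job = build_job("orders", CLIENTS)
    for task in job["tasks"]:
        print(task["task_key"], task.get("depends_on", []))
```

Because every client gets its own chain inside the same job, a failure in one client's hop does not block the other clients, while the notebook code for the table stays shared.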

Platform Features Enabling This Pattern:

Databricks Workflows: Orchestrate parameterized, multi-tenant pipelines.
Delta Live Tables (DLT): Declaratively define ETL flows partitioned by client.
Unity Catalog: Fine-grained access control and catalog/schema separation per client.
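
For the Unity Catalog point, a small sketch of how per-client access could be scripted. It assumes one catalog per client with one schema per layer (an assumption, not something confirmed in this thread), and the group name and the helper `client_grants` are hypothetical. The statements use the Unity Catalog privileges `USE CATALOG`, `USE SCHEMA`, and `SELECT`.

```python
def client_grants(client, group, layers=("gold",)):
    """Return GRANT statements giving one group read access to one client's layers."""
    # Access to the catalog itself is required before any schema can be used.
    statements = [f"GRANT USE CATALOG ON CATALOG {client} TO `{group}`"]
    for layer in layers:
        schema = f"{client}.{layer}"
        statements.append(f"GRANT USE SCHEMA ON SCHEMA {schema} TO `{group}`")
        # SELECT on the schema covers the tables inside it.
        statements.append(f"GRANT SELECT ON SCHEMA {schema} TO `{group}`")
    return statements


if __name__ == "__main__":
    # Hypothetical client and group names.
    for statement in client_grants("client_a", "client_a_analysts"):
        print(statement)
```

Each client's analysts only ever receive grants on their own catalog, so the shared pipelines do not weaken isolation between tenants.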

Extra tips:
Leverage partitioning and schema separation by client within each layer, and use centralized pipelines to tune job frequencies and resource usage.

Summary:
Organizing by per-table, multi-tenant pipelines is Databricks’ best practice for efficient, standardized, and easily governed medallion data flows at scale.

I hope this helps.

Best,

Sarah

mbanxp · ‎10-10-2025

Hi sarahbhord !!!

Thanks very much for the useful reply; it really helps me understand the best approach to follow. In my case I have roughly the following architecture:

Based on the approach of independent pipelines per table for all clients, what would be your recommendation?

Most suitable Data Promotion orchestration for multi-tenant data lake in Databricks
