cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Use DLT from another pipeline

AxelBrsn
New Contributor III

Hello, I have a question.

Context :

I have a Unity Catalog organized with three schemas (bronze, silver and gold). Logically, I would like to create tables in each schemas.

I tried to organize my pipelines on the layers, which mean that I would like to have three pipelines :

  1. Bronze, with destination the bronze schema
  2. Silver, with destination the silver schema
  3. Gold, with destination the gold schema

When I try to start the Silver pipeline, I have an error that the dataset of the bronze table is not defined in the pipeline.

The bronze pipeline worked pretty good, and DLT exists in the bronze schema.

My question is : How can I use DLT from other pipelines?

Currently, my query to create the silver DLT is :

CREATE OR REPLACE LIVE TABLE silver_table
USING DELTA
AS
SELECT
MY DATA TRANSFORMATIONS
from LIVE.bronze_table

Thanks for your help,

1 ACCEPTED SOLUTION

Accepted Solutions

AxelBrsn
New Contributor III

Hello, thanks for the answers @YuliyanBogdanov, @standup1.
So the solution is to use catalog.schema.table, and not LIVE.table, that's the key, you were right standup!

But, you won't have the visibility of the tables on Bronze Pipeline, if you are on Silver one.

View solution in original post

3 REPLIES 3

standup1
Contributor

To my knowledge and as of today, DLT does not support multi schemas and you can’t cross from one pipeline to another using “live.table”. However, live table is just a materialized view. You can change your script to create materialized view instead of (live table) but don’t use from live.table (This only works if it is in the same pipeline) . Try from “schema”.table. Something like this 

CREATE MATERIALIZED VIEW my_silver_table 
AS
SELECT count(distinct event_id) as event_count from my_bronze_table;

 

YuliyanBogdanov
New Contributor III

Hey @AxelBrsn,

Unfortunately this is a limitation with DLT as far as my experience goes. You should organize the pipelines in a way that they encompass the full Bronze/Silver/Gold flow, since you don't have control over the schema if you want to make the most out of DLT. You can always read from registered tables, but this can lead to loss of lineage and/or other problems:

Reading from UC: https://docs.databricks.com/en/delta-live-tables/unity-catalog.html#batch-ingestion-from-a-unity-cat...

Reading from the Hive Metastore: https://docs.databricks.com/en/delta-live-tables/unity-catalog.html#batch-ingestion-from-a-unity-cat...

Using materialized views as @standup1 suggested is also an option, but the cleanest way would be to go with reorganizing your pipelines to follow the full medallion structure from bronze to gold.

 

 

AxelBrsn
New Contributor III

Hello, thanks for the answers @YuliyanBogdanov, @standup1.
So the solution is to use catalog.schema.table, and not LIVE.table, that's the key, you were right standup!

But, you won't have the visibility of the tables on Bronze Pipeline, if you are on Silver one.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group