Databricks Community

AxelBrsn · ‎03-27-2024

Hello, I have a question.

Context :

I have a Unity Catalog organized with three schemas (bronze, silver and gold). Logically, I would like to create tables in each schemas.

I tried to organize my pipelines on the layers, which mean that I would like to have three pipelines :

Bronze, with destination the bronze schema
Silver, with destination the silver schema
Gold, with destination the gold schema

When I try to start the Silver pipeline, I have an error that the dataset of the bronze table is not defined in the pipeline.

The bronze pipeline worked pretty good, and DLT exists in the bronze schema.

My question is : How can I use DLT from other pipelines?

Currently, my query to create the silver DLT is :

CREATE OR REPLACE LIVE TABLE silver_table
USING DELTA
AS
SELECT
MY DATA TRANSFORMATIONS
from LIVE.bronze_table

Thanks for your help,

AxelBrsn · ‎04-02-2024

Hello, thanks for the answers @YuliyanBogdanov, @standup1.
So the solution is to use catalog.schema.table, and not LIVE.table, that's the key, you were right standup!

But, you won't have the visibility of the tables on Bronze Pipeline, if you are on Silver one.

View solution in original post

standup1 · ‎03-28-2024

To my knowledge and as of today, DLT does not support multi schemas and you can’t cross from one pipeline to another using “live.table”. However, live table is just a materialized view. You can change your script to create materialized view instead of (live table) but don’t use from live.table (This only works if it is in the same pipeline) . Try from “schema”.table. Something like this

CREATE MATERIALIZED VIEW my_silver_table 
AS
SELECT count(distinct event_id) as event_count from my_bronze_table;

YuliyanBogdanov · ‎03-28-2024

Hey @AxelBrsn,

Unfortunately this is a limitation with DLT as far as my experience goes. You should organize the pipelines in a way that they encompass the full Bronze/Silver/Gold flow, since you don't have control over the schema if you want to make the most out of DLT. You can always read from registered tables, but this can lead to loss of lineage and/or other problems:

Reading from UC: https://docs.databricks.com/en/delta-live-tables/unity-catalog.html#batch-ingestion-from-a-unity-cat...

Reading from the Hive Metastore: https://docs.databricks.com/en/delta-live-tables/unity-catalog.html#batch-ingestion-from-a-unity-cat...

Using materialized views as @standup1 suggested is also an option, but the cleanest way would be to go with reorganizing your pipelines to follow the full medallion structure from bronze to gold.

AxelBrsn · ‎04-02-2024

Hello, thanks for the answers @YuliyanBogdanov, @standup1.
So the solution is to use catalog.schema.table, and not LIVE.table, that's the key, you were right standup!

But, you won't have the visibility of the tables on Bronze Pipeline, if you are on Silver one.