- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-27-2024 09:06 AM
Hello, I have a question.
Context :
I have a Unity Catalog organized with three schemas (bronze, silver and gold). Logically, I would like to create tables in each schemas.
I tried to organize my pipelines on the layers, which mean that I would like to have three pipelines :
- Bronze, with destination the bronze schema
- Silver, with destination the silver schema
- Gold, with destination the gold schema
When I try to start the Silver pipeline, I have an error that the dataset of the bronze table is not defined in the pipeline.
The bronze pipeline worked pretty good, and DLT exists in the bronze schema.
My question is : How can I use DLT from other pipelines?
Currently, my query to create the silver DLT is :
CREATE OR REPLACE LIVE TABLE silver_table USING DELTA AS SELECT MY DATA TRANSFORMATIONS from LIVE.bronze_table |
Thanks for your help,
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-02-2024 06:07 AM
Hello, thanks for the answers @YuliyanBogdanov, @standup1.
So the solution is to use catalog.schema.table, and not LIVE.table, that's the key, you were right standup!
But, you won't have the visibility of the tables on Bronze Pipeline, if you are on Silver one.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-28-2024 03:33 AM
To my knowledge and as of today, DLT does not support multi schemas and you can’t cross from one pipeline to another using “live.table”. However, live table is just a materialized view. You can change your script to create materialized view instead of (live table) but don’t use from live.table (This only works if it is in the same pipeline) . Try from “schema”.table. Something like this
CREATE MATERIALIZED VIEW my_silver_table AS SELECT count(distinct event_id) as event_count from my_bronze_table;
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-28-2024 08:15 AM
Hey @AxelBrsn,
Unfortunately this is a limitation with DLT as far as my experience goes. You should organize the pipelines in a way that they encompass the full Bronze/Silver/Gold flow, since you don't have control over the schema if you want to make the most out of DLT. You can always read from registered tables, but this can lead to loss of lineage and/or other problems:
Reading from UC: https://docs.databricks.com/en/delta-live-tables/unity-catalog.html#batch-ingestion-from-a-unity-cat...
Reading from the Hive Metastore: https://docs.databricks.com/en/delta-live-tables/unity-catalog.html#batch-ingestion-from-a-unity-cat...
Using materialized views as @standup1 suggested is also an option, but the cleanest way would be to go with reorganizing your pipelines to follow the full medallion structure from bronze to gold.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-02-2024 06:07 AM
Hello, thanks for the answers @YuliyanBogdanov, @standup1.
So the solution is to use catalog.schema.table, and not LIVE.table, that's the key, you were right standup!
But, you won't have the visibility of the tables on Bronze Pipeline, if you are on Silver one.