03-27-2024 09:06 AM
Hello, I have a question.
Context :
I have a Unity Catalog organized with three schemas (bronze, silver and gold). Logically, I would like to create tables in each schemas.
I tried to organize my pipelines on the layers, which mean that I would like to have three pipelines :
When I try to start the Silver pipeline, I have an error that the dataset of the bronze table is not defined in the pipeline.
The bronze pipeline worked pretty good, and DLT exists in the bronze schema.
My question is : How can I use DLT from other pipelines?
Currently, my query to create the silver DLT is :
CREATE OR REPLACE LIVE TABLE silver_table USING DELTA AS SELECT MY DATA TRANSFORMATIONS from LIVE.bronze_table |
Thanks for your help,
04-02-2024 06:07 AM
Hello, thanks for the answers @YuliyanBogdanov, @standup1.
So the solution is to use catalog.schema.table, and not LIVE.table, that's the key, you were right standup!
But, you won't have the visibility of the tables on Bronze Pipeline, if you are on Silver one.
03-28-2024 03:33 AM
To my knowledge and as of today, DLT does not support multi schemas and you can’t cross from one pipeline to another using “live.table”. However, live table is just a materialized view. You can change your script to create materialized view instead of (live table) but don’t use from live.table (This only works if it is in the same pipeline) . Try from “schema”.table. Something like this
CREATE MATERIALIZED VIEW my_silver_table AS SELECT count(distinct event_id) as event_count from my_bronze_table;
03-28-2024 08:15 AM
Hey @AxelBrsn,
Unfortunately this is a limitation with DLT as far as my experience goes. You should organize the pipelines in a way that they encompass the full Bronze/Silver/Gold flow, since you don't have control over the schema if you want to make the most out of DLT. You can always read from registered tables, but this can lead to loss of lineage and/or other problems:
Reading from UC: https://docs.databricks.com/en/delta-live-tables/unity-catalog.html#batch-ingestion-from-a-unity-cat...
Reading from the Hive Metastore: https://docs.databricks.com/en/delta-live-tables/unity-catalog.html#batch-ingestion-from-a-unity-cat...
Using materialized views as @standup1 suggested is also an option, but the cleanest way would be to go with reorganizing your pipelines to follow the full medallion structure from bronze to gold.
04-02-2024 06:07 AM
Hello, thanks for the answers @YuliyanBogdanov, @standup1.
So the solution is to use catalog.schema.table, and not LIVE.table, that's the key, you were right standup!
But, you won't have the visibility of the tables on Bronze Pipeline, if you are on Silver one.
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.
If there isn’t a group near you, start one and help create a community that brings people together.
Request a New Group