Hi experts,
I have defined my DLT Pipeline as follows:
-- Define a streaming table to ingest data from a volume
CREATE OR REFRESH STREAMING TABLE pumpdata_bronze
TBLPROPERTIES ("myCompanyPipeline.quality" = "bronze")
AS SELECT * FROM cloud_files("abfss://xxx@xxx.dfs.core.windows.net/xxx/*/*/*/*/*.JSON", "JSON");

-- Define a streaming table to ingest data from a volume
CREATE OR REFRESH STREAMING TABLE pumpdata_silver
PARTITIONED BY (extracted_date)
COMMENT "The cleaned sales orders with valid order_number(s) and partitioned by order_datetime."
TBLPROPERTIES ("myCompanyPipeline.quality" = "silver")
AS SELECT
  DATE(EnqueuedTimeUtc) AS extracted_date,
  DATE_FORMAT(EnqueuedTimeUtc, 'HH:mm:ss') AS extracted_time,
  ROUND(Body:distance, 2) AS distance
FROM STREAM(bstdwh.pumpdata_bronze)
WHERE Body IS NOT NULL;
When I start this pipeline, I expect the Bronze table to refresh first and the Silver table to refresh only after Bronze has completed. However, both run in parallel, so the Silver table misses the latest data.
Am I missing a setting?