Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
12-08-2025 08:37 AM
Hi everyone,
I'm working on a LDP and need help ensuring a downstream table only runs if a primary key unique validation check passes. In something like dbt this is very easy to configure but with LDP it seems to require creating a separate view. Additionally, it's not clear to me how to integrate that extra step into the pipeline graph since checking uniqueness requires aggregating.
Setup:
-- Step 1: Bronze streaming table
CREATE OR REFRESH STREAMING TABLE bronze_data AS
SELECT * FROM STREAM read_files('path/to/data/*.parquet');
-- Step 2: Silver table with deduplication
CREATE OR REFRESH MATERIALIZED VIEW silver_data AS
SELECT * FROM bronze_data
WHERE load_date = (SELECT MAX(load_date) FROM bronze_data);
-- Step 3: Separate PK validation table
CREATE OR REFRESH PRIVATE MATERIALIZED VIEW silver_data_pk_check(
CONSTRAINT unique_pk EXPECT (record_count = 1)
ON VIOLATION FAIL UPDATE
) AS
SELECT primary_key, COUNT(*) as record_count
FROM silver_data
GROUP BY primary_key;
-- Step 4: Gold/Analytics table
CREATE OR REFRESH MATERIALIZED VIEW gold_data AS
SELECT * FROM silver_data
LEFT JOIN dimension_table ON ...;I guess i could force the dependency with a cross join or where exists but the whole thing feels weird. Is there a pattern how to do this?