Re: SCD-2 backfilling with streaming tabels

User16502773013 · ‎02-23-2026

Hello JD,

Here is a suggested approach:

Within the single pipeline that owns dim_customer :

1. Declare the streaming table with __START_AT and __END_AT .

2. Add a backfill flow:

CREATE FLOW backfill_dim_customer AS
INSERT INTO ONCE dim_customer BY NAME
SELECT
customer_id,
...,
MIN(order_date) AS __START_AT,
CAST(NULL AS TIMESTAMP) AS __END_AT
FROM raw_orders
GROUP BY customer_id, ...;

3. Add an AUTO CDC SCD2 flow using your CDC source and a proper SEQUENCE BY column that reflects change/effective time.
That will give you SCD2 history where the first version for each customer starts in 2019 (or whenever the first order occurred), and subsequent changes will be tracked automatically going forward.