Updating a streaming table in dlt

ashraf1395
Honored Contributor

Can we update a streaming table in DLT when the source and target are the same table? That is, the update should be applied to the table itself. If so, could you guide me on how to do it?

I tried append_flow, but it only appends data.

As for CDC, I am not sure whether the source and target can be the same table there.

Walter_C
Databricks Employee

You can define a CDC flow to update the streaming table. This involves reading from the same table and applying the changes.

import dlt

@dlt.table(
    name="my_streaming_table",
    comment="This table is updated using CDC",
    table_properties={"quality": "silver"}
)
# Drop rows where column_name is NULL
@dlt.expect_or_drop("valid_data", "column_name IS NOT NULL")
def update_table():
    # Stream from the table and keep only the rows flagged as updates
    source_df = dlt.read_stream("my_streaming_table")
    changes_df = source_df.filter("change_type = 'update'")

    return changes_df

Ensure your pipeline is configured to support CDC. This is done in the pipeline settings; for example, the selected product edition must be one that supports CDC.
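
To make "applying changes" concrete, here is a minimal plain-Python sketch of the upsert semantics a CDC flow provides: each change is matched to the target on a key, the change with the highest sequence value wins, and deletes remove the row. This is not DLT code and runs without Databricks. The function name `apply_changes_sketch` and the columns `id`, `seq`, and `change_type` are assumptions made up for illustration.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def apply_changes_sketch(
    target: Dict[Any, Row],
    changes: Iterable[Row],
    key: str = "id",            # hypothetical key column
    sequence_by: str = "seq",   # hypothetical ordering column
) -> Dict[Any, Row]:
    """Return a copy of `target` with `changes` upserted by key."""
    result = {k: dict(v) for k, v in target.items()}
    # Process changes in sequence order so the latest change wins.
    for change in sorted(changes, key=lambda r: r[sequence_by]):
        k = change[key]
        current = result.get(k)
        # Skip changes older than the row the target already holds.
        if current is not None and current[sequence_by] > change[sequence_by]:
            continue
        if change.get("change_type") == "delete":
            result.pop(k, None)
        else:
            # Insert or update: store the row without the change marker.
            result[k] = {c: v for c, v in change.items() if c != "change_type"}
    return result


target = {
    1: {"id": 1, "name": "a", "seq": 1},
    2: {"id": 2, "name": "b", "seq": 1},
}
changes = [
    {"id": 1, "name": "a2", "seq": 3, "change_type": "update"},
    {"id": 1, "name": "stale", "seq": 2, "change_type": "update"},
    {"id": 2, "seq": 2, "change_type": "delete"},
    {"id": 3, "name": "c", "seq": 1, "change_type": "insert"},
]
updated = apply_changes_sketch(target, changes)
```

In a pipeline, these are the semantics that `dlt.apply_changes` provides when given `target`, `source`, `keys`, and `sequence_by`, with the target declared through `dlt.create_streaming_table`. As far as I know, the change feed passed as `source` would normally be a separate dataset from the target, so it is worth checking whether your changes can be staged in their own table or view first.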