
Is Lakeflow Connect SCD Type 2 output incompatible with Spark Declarative Pipelines streaming tables?

lrm_data
New Contributor III

## Problem

When using Lakeflow Connect to ingest from SQL Server with SCD Type 2 enabled, any downstream Streaming Table (AUTO CDC flow) in a Spark Declarative Pipeline that reads the bronze table fails with the following error:

"An error occurred because we detected an update or delete to one or more rows in the source table. Streaming tables may only use append-only streaming sources."

This happens because Lakeflow Connect applies MERGE operations to its bronze target table when writing SCD2 history: it updates __END_AT on existing rows when new versions arrive. This makes the bronze table non-append-only, which violates the streaming table contract.
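To make the mechanism concrete, here is a toy, in-memory sketch of an SCD2 upsert. It is not Lakeflow Connect's implementation; the `scd2_upsert` helper and the `id`, `value`, and `__START_AT` columns are assumptions for illustration (only `__END_AT` comes from the description above). The point is that each new version also rewrites an existing row, so the table is not append-only.

```python
from datetime import datetime

# Toy stand-in for the bronze SCD2 table (a list of row dicts).
bronze = []


def scd2_upsert(table, key, value, ts):
    """Close the active version for `key` in place, then append the new version.

    Returns the number of existing rows that were updated.
    """
    rows_updated = 0
    for row in table:
        if row["id"] == key and row["__END_AT"] is None:
            row["__END_AT"] = ts  # in-place UPDATE of an existing row: not an append
            rows_updated += 1
    table.append({"id": key, "value": value, "__START_AT": ts, "__END_AT": None})
    return rows_updated


first = scd2_upsert(bronze, 1, "a", datetime(2024, 1, 1))   # pure append
second = scd2_upsert(bronze, 1, "b", datetime(2024, 2, 1))  # append plus one update
```

The second call updates one previously written row, which is the kind of change a streaming read of the base table rejects.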

We designed this using streaming architecture as we may want to enable continuous data processing. However, for now, we can process this in batch. 

These tables are large, so a materialized view may not be an option. AUTO CDC FROM SNAPSHOT is not an option either, as it expects a non-streaming source. What is the recommendation for processing data in later layers?

1 ACCEPTED SOLUTION

Accepted Solutions

lrm_data
New Contributor III

Following up with a recommendation from Databricks:

For tables that need incremental processing:

SQL Server → Lakeflow Connect → Bronze SCD2 Streaming Table (CDF enabled) → consume the CDF (not the base table) using AUTO CDC → Silver SCD2 Streaming Table → Downstream MVs or Streaming Tables
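As I understand it, this works because the change data feed records each commit's changes as new rows, so the feed itself only ever grows even when the base table is updated in place. Below is a toy model of that idea in plain Python. It is not how Delta produces the feed. The `change_events` helper and the sample rows are hypothetical, while `_change_type` (with values `insert`, `update_preimage`, `update_postimage`, `delete`) and `_commit_version` are the metadata columns the change data feed exposes.

```python
def change_events(before, after, commit_version):
    """Emit CDF-style rows for the difference between two table snapshots.

    Snapshots are dicts keyed by (id, __START_AT), an assumed row identity.
    """
    events = []
    for key, new in after.items():
        old = before.get(key)
        if old is None:
            events.append({**new, "_change_type": "insert", "_commit_version": commit_version})
        elif old != new:
            events.append({**old, "_change_type": "update_preimage", "_commit_version": commit_version})
            events.append({**new, "_change_type": "update_postimage", "_commit_version": commit_version})
    for key, old in before.items():
        if key not in after:
            events.append({**old, "_change_type": "delete", "_commit_version": commit_version})
    return events


# Bronze after the first load, then after a MERGE that closes version "a" and adds "b".
v1 = {(1, "2024-01-01"): {"id": 1, "value": "a", "__START_AT": "2024-01-01", "__END_AT": None}}
v2 = {
    (1, "2024-01-01"): {"id": 1, "value": "a", "__START_AT": "2024-01-01", "__END_AT": "2024-02-01"},
    (1, "2024-02-01"): {"id": 1, "value": "b", "__START_AT": "2024-02-01", "__END_AT": None},
}

feed = []  # the feed only ever gets appended to
feed += change_events({}, v1, commit_version=1)
feed += change_events(v1, v2, commit_version=2)

change_types = [e["_change_type"] for e in feed]
# Downstream consumers commonly drop pre-images and keep the rest.
usable = [e for e in feed if e["_change_type"] != "update_preimage"]
```

In an actual pipeline the silver step would read the bronze table's feed (for example with the Delta `readChangeFeed` read option) and pass those rows to the AUTO CDC flow. Which change types to keep, and which column to sequence by, depends on the table and is not specified in the recommendation above.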
