Databricks Community

manish_de · ‎05-08-2026

In ingestion pipeline, for query based connector there is option of selecting batch snapshot instead of column name under dropdown - Cursor column. If I choose batch snapshot, will the databricks engine run select * from my source table, say Sql server as per job schedule. Or does it do something intelligent use some where clause and avoid selecting the whole table

amirabedhiafi · ‎05-10-2026

Hi @manish_de !

For query based lakeflow connect, DBKS is not doing CDC magic unless you provide a valid cursor column.

If you select batch snapshot instead of a cursor column, you should treat it as a full snapshot or full load each scheduled run.

In practice, DBKS needs to read the source result set again from SQL Server then apply or merge the result into the destination. It may not literally issue SELECT * in every case because it can project configured columns but conceptually it is scanning the source query without an incremental WHERE cursor > last_watermark filter.

When a cursor column is configured on each run DBKS retrieves rows where the cursor value is greater than the previously stored high water mark and if you do not select a monotonically increasing cursor column the connector performs a full load on each run.

Think of it like batch snapshot = full source snapshot per run.

If this answer resolves your question, could you please mark it as “Accept as Solution”? It will help other users quickly find the correct fix.

Senior BI/Data Engineer | Microsoft MVP Data Platform | Microsoft MVP Power BI | Power BI Super User | C# Corner MVP

manish_de · ‎05-10-2026

@amirabedhiafi I just wanted to double check, there will be no where clause in the sql sent everytime to my database for batch snapshot?

amirabedhiafi · ‎05-11-2026

Yes ! because with batch snapshot DBKS should be treated as doing a full snapshot read each run and won't automatically add WHERE to avoid scanning the whole source table.

If this answer resolves your question, could you please mark it as “Accept as Solution”? It will help other users quickly find the correct fix.

Senior BI/Data Engineer | Microsoft MVP Data Platform | Microsoft MVP Power BI | Power BI Super User | C# Corner MVP

michaelfriendly · ‎05-11-2026

From what I’ve seen, choosing batch snapshot usually means the connector treats each run like a full snapshot load, so yes, it may execute something very close to a `SELECT *` on the source table unless the platform adds its own partitioning or optimization behind the scenes. If no cursor column or incremental filter is defined, there generally isn’t a smart `WHERE` clause to reduce reads, so checking the generated rbtv query logs in Databricks or SQL Server profiler would give the clearest answer.

michaelfriendly · ‎05-20-2026

@rbtv It may execute something very similar to a `SELECT *` on the source table unless the platform adds its own partitioning or optimisation behind the scenes. From what I've observed, selecting batch snapshot often means the connector handles each run like a full snapshot load. The most straightforward solution would be to examine the generated rbtv query logs in Databricks or SQL Server Profiler. If no cursor column or incremental filter is specified, there typically isn't a smart `WHERE` clause to reduce reads.

Databricks Community

query based connector snapshot feature

Congratulations Databricks Partners! You're Now Officially Recognized in the Databricks Community

Solution Accelerator Series | Measure Ad Effectiveness With Multi-Touch Attribution

Govern AI Spend at Scale: A Data-Driven Approach to AI Governance | Webinar

Databricks AMER Learning Festival | Virtual Training

Introducing the Genie Hub: Ask Questions, Share Builds, and Master Conversational Analytics