Lakeflow Connect - Column filtering

amitpm — Tue, 17 Jun 2025 06:45:05 GMT

Hi community, I am interested in learning more about a feature mentioned at the recent summit: query pushdown in Lakeflow Connect for SQL Server. I believe this feature will make it possible to select only the required columns from source tables. Is there any indication of when it will be available for trial? While we wait, are there other alternatives that anyone recommends?

Re: Lakeflow Connect - Column filtering

Isi — Sun, 22 Jun 2025 12:01:31 GMT

Hey @amitpm

According to the documentation, this feature is currently in Public Preview, so if your Databricks account has access to public preview features, you can reach out to support to enable it and start testing performance.

Setup guide for Lakehouse Federation with SQL Server

Supported pushdowns (predicates, projections, etc.)

While Lakehouse Federation has some performance limitations, it’s a solid alternative if you want to avoid using the Spark connector and prefer a declarative connection model.

Comparison between Lakehouse Federation vs. LakeFlow Connect

Known limitations of Lakehouse Federation

Hope this helps, 🙂

Isi

Re: Lakeflow Connect - Column filtering

Ashwin_DSA — Fri, 05 Jun 2026 13:39:45 GMT

Hi @amitpm,

Just closing the loop on this thread, and apologies for the very late follow-up.

The feature being discussed here maps to Lakeflow Connect query-based connectors for SQL Server, and that capability is now available. Query-based connectors for SQL Server were announced in Public Preview in April 2026, and more recently, the team confirmed they were going GA and that the documentation had been updated accordingly.

If your original goal was to avoid copying every column from the source, Lakeflow Connect now has documented support for selecting which columns to ingest through the include_columns and exclude_columns settings. Query-based connectors also support row filtering, so filters can be applied at the source.
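
As a rough illustration of what column selection might look like in a pipeline definition, here is a minimal Python sketch that builds one table entry as a plain dictionary. Only include_columns and exclude_columns come from this thread. The helper name table_spec, the surrounding keys (table, source_schema, source_table, destination_catalog, destination_schema, table_configuration), and the example table and column names are assumptions made for illustration, so please check the current Lakeflow Connect documentation for the exact shape before using it.

```python
import json


def table_spec(source_schema, source_table, dest_catalog, dest_schema,
               include_columns=None, exclude_columns=None):
    """Build one table entry for an ingestion definition.

    Hypothetical helper: the key names other than include_columns and
    exclude_columns are assumptions, not a confirmed API shape.
    """
    config = {}
    if include_columns:
        # Ingest only these columns from the source table.
        config["include_columns"] = list(include_columns)
    if exclude_columns:
        # Ingest everything except these columns.
        config["exclude_columns"] = list(exclude_columns)
    return {
        "table": {
            "source_schema": source_schema,
            "source_table": source_table,
            "destination_catalog": dest_catalog,
            "destination_schema": dest_schema,
            "table_configuration": config,
        }
    }


# Example values (dbo.orders, main.bronze, column names) are made up.
spec = table_spec(
    "dbo", "orders", "main", "bronze",
    include_columns=["order_id", "customer_id", "updated_at"],
)
print(json.dumps(spec, indent=2))
```

The sketch only assembles and prints the structure. It does not call any Databricks API, so how the entry is submitted (UI, REST, bundles, or SDK) is left open.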

So, in short, you no longer need to wait on this. If you are evaluating SQL Server ingestion, the query-based path is the most relevant option when you want scheduled incremental ingestion with a cursor column and more control over what data is brought across. If you need the traditional CDC or CT approach instead, the standard SQL Server Lakeflow Connect connector is still available, and if the requirement is zero-copy access rather than ingestion, Lakehouse Federation for SQL Server is also worth considering.

If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.