Re: Ingest data from Snowflake to Databricks

aharisaibabu · 06-06-2026

Hi Louis,

Thanks for the detailed explanation and research.

I have a follow-up question regarding Lakehouse Federation. When I review the documentation, Databricks describes Query Federation as being intended for scenarios such as on-demand reporting, proof-of-concept work, exploratory ETL, and incremental migrations. The documentation also explicitly states that Query Federation is meant for use cases where "you don't want to ingest data into Databricks."

Given that guidance, I'm trying to reconcile it with the recommendation to use Lakehouse Federation as part of a long-term ingestion strategy from Snowflake to Databricks.

For a production use case where the objective is to ingest large volumes of data from Snowflake into Delta tables using custom SQL queries and scheduled pipelines, would you still recommend the Lakehouse Federation + Lakeflow Connect approach over the Snowflake Spark Connector?

My concern is that Federation appears to be positioned primarily as a query/access layer rather than a high-volume ingestion mechanism. Is the expectation that Lakeflow Connect effectively addresses that gap and therefore becomes the preferred long-term architecture, or are there scenarios where the Snowflake Spark Connector remains the better choice for large-scale ingestion workloads?

I'd appreciate your perspective on how Databricks intends these technologies to be used together for enterprise-scale ingestion patterns.

References:
1. https://docs.databricks.com/aws/en/query-federation/database-federation