
Lakeflow Connect Data ingestion from SQL Server and PostgreSQL to Databricks with CDC

shan-databricks
Databricks Partner
We have a requirement to use Lakeflow Connect for data ingestion from SQL Server and PostgreSQL into Databricks with CDC and Lakehouse Federation. I would like to understand the pros and cons of Lakeflow Connect in the following areas:
 
Firewall/gateway considerations
CDC capabilities
Reliability
Overall success of Lakeflow implementation
Overall success of Lakehouse federation
1 REPLY

ziafazal
Databricks Partner

Hi @shan-databricks 

You should set up PostgreSQL for ingestion via Lakeflow Connect. Once Postgres logical replication is ready, you create the ingestion pipelines, which comprise a gateway pipeline and an ingestion pipeline. The gateway is a continuous pipeline that pulls changed data from the source Postgres database and stores it in your staging catalog in Databricks. The ingestion gateway pipeline should use compute that resides in your Databricks VPC, and that VPC should be whitelisted in your firewall. The second pipeline uses serverless compute to move the changed data from the staging catalog to the bronze schema of the target catalog.
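To make the two-step layout concrete, here is a rough Python sketch. It is not the Lakeflow Connect API; every name in it (`logical_replication_issues`, `PipelinePlan`, `plan_ingestion`, the `bronze` schema name) is made up for illustration. The first function checks the standard Postgres server parameters that logical replication depends on (`wal_level`, `max_replication_slots`, `max_wal_senders`), assuming you have already read their values from the source database, for example with `SHOW wal_level`. The second part only models which pipeline reads from where and on what compute, as described above.

```python
from __future__ import annotations

from dataclasses import dataclass


def logical_replication_issues(settings: dict[str, str]) -> list[str]:
    """Return the problems that would block Postgres logical replication.

    `settings` maps server parameter names to their values as strings.
    Fetching them from the database is deliberately left out of this sketch.
    """
    issues = []
    wal_level = settings.get("wal_level", "")
    if wal_level != "logical":
        issues.append(f"wal_level must be 'logical' (found {wal_level!r})")
    # Logical decoding needs at least one replication slot and one WAL sender.
    for name in ("max_replication_slots", "max_wal_senders"):
        if int(settings.get(name, "0")) < 1:
            issues.append(f"{name} must be at least 1")
    return issues


@dataclass(frozen=True)
class PipelinePlan:
    """Hypothetical description of one pipeline; not a Databricks object."""
    role: str
    schedule: str
    compute: str
    reads_from: str
    writes_to: str


def plan_ingestion(source_db: str, staging_catalog: str,
                   target_catalog: str) -> list[PipelinePlan]:
    """Model the gateway and ingestion pipelines from the description above."""
    gateway = PipelinePlan(
        role="gateway",
        schedule="continuous",
        compute="in-vpc",  # must be reachable through the source firewall
        reads_from=source_db,
        writes_to=staging_catalog,
    )
    ingestion = PipelinePlan(
        role="ingestion",
        schedule="unspecified",  # the schedule is not stated in this reply
        compute="serverless",
        reads_from=staging_catalog,
        writes_to=f"{target_catalog}.bronze",  # assumed schema name
    )
    return [gateway, ingestion]


if __name__ == "__main__":
    current = {"wal_level": "replica", "max_replication_slots": "10",
               "max_wal_senders": "10"}
    for issue in logical_replication_issues(current):
        print("blocker:", issue)
    for plan in plan_ingestion("pg_source", "staging", "main"):
        print(plan)
```

The point of the split is that only the gateway needs a network path to the source database, so the firewall rule is scoped to the VPC where the gateway compute runs; the serverless pipeline only ever reads from the staging catalog.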

Thanks