Community Articles

Databricks Lakeflow - Orchestration Layer is moving to where it belongs

balajij8
Contributor

I discussed eliminating the BI & Metrics Tax using Databricks Metric Views here. Organizations also face an older, more persistent tax — the Ingestion Tax.

To ingest data from a source like Salesforce or SQL Server into your Lakehouse, you typically stitch together various cloud and ad hoc tools (ADF, Airflow, Fivetran, Airbyte, etc.). Managing separate scripts and tools for each ingestion path is a challenging task.

Organizations suffer tooling sprawl in the form of multiple licenses, governance silos, and endless management cycles. With Databricks Lakeflow Connect, the era of high-friction ingestion is over. Ingestion is no longer a headache.

The Beginning of the Cloud Ingestion Tax
In 2019, Databricks was seen primarily as the compute engine, so organizations felt they needed separate ingestion tools and connectors via cloud data factories and ad hoc scripts. This created a wall at the head of the stack.

  • Fragmentation: Ingestion happened in one tool, modelling logic in another (dbt/Spark), and orchestration in yet another.
  • Governance Gap: Data lineage died the moment data left the source tool, leaving end-to-end transparency a dream.
  • Maintenance Gap: Data engineers had to manually fix pipelines every time a source system updated its API.

Lakeflow Connect:

Lakeflow Connect & Jobs, part of the broader Lakeflow platform, eliminate these taxes by bringing ingestion, modelling, and orchestration into a single pane.

  • High-Performance Connectors - Whether the source is an enterprise SaaS app or a database, Lakeflow Connect provides simple, managed ingestion. No need to manage JDBC drivers just to fetch a table.
  • Change Data Capture - Lakeflow Connect leverages efficient incremental CDC to keep the Lakehouse fresh with no added complexity.
  • Governance with Unity Catalog - You can trace field-level lineage through bronze and silver tables all the way to Metric Views, and monitor data quality from the moment data enters the Lakehouse.
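To make the CDC bullet above concrete, here is a minimal conceptual sketch of what incremental CDC application amounts to. This is a hypothetical in-memory illustration, not the Lakeflow Connect API: change events carrying a key, a sequence number, and an operation are replayed in commit order as upserts and deletes against a target table.

```python
# Conceptual sketch of incremental CDC apply logic (hypothetical helper,
# NOT the Lakeflow Connect API): events are ordered by sequence number,
# then upserted into / deleted from an in-memory target keyed by "id".

def apply_cdc(target: dict, events: list[dict]) -> dict:
    """Apply CDC events (op = 'upsert' | 'delete') to target, keyed by 'id'."""
    for ev in sorted(events, key=lambda e: e["seq"]):  # replay in commit order
        if ev["op"] == "delete":
            target.pop(ev["id"], None)
        else:  # upsert: the latest version of the row wins
            target[ev["id"]] = {k: v for k, v in ev.items()
                                if k not in ("op", "seq")}
    return target

# Example: an insert, an update, and a delete arriving out of order.
events = [
    {"seq": 2, "op": "upsert", "id": 1, "name": "Alice", "tier": "gold"},
    {"seq": 1, "op": "upsert", "id": 1, "name": "Alice", "tier": "silver"},
    {"seq": 3, "op": "upsert", "id": 2, "name": "Bob", "tier": "bronze"},
    {"seq": 4, "op": "delete", "id": 2},
]
print(apply_cdc({}, events))  # {1: {'id': 1, 'name': 'Alice', 'tier': 'gold'}}
```

Lakeflow Connect handles this ordering, deduplication, and merge work for you at scale against Delta tables; the sketch only shows why sequence-aware replay keeps the target consistent when changes arrive out of order.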

We migrated from custom pipelines (Salesforce and SQL Server) to Lakeflow Connect & Jobs seamlessly, and its compatibility is great. Stop paying the Ingestion & Orchestration Tax by moving to Lakeflow Connect & Jobs.
