Announcements
Stay up-to-date with the latest announcements from Databricks. Learn about product updates, new features, and important news that impact your data analytics workflow.

Announcing Backfill Runs in Lakeflow Jobs for Higher Quality Downstream Data

Sujitha
Databricks Employee

Managing complex data ecosystems with numerous sources and constant updates is challenging for data engineering teams. They often face unpredictable but common issues like cloud vendor outages, broken connections to data sources, late-arriving data, or even data quality issues at the source. Other times, they have to deal with sudden business rule changes that impact the entire data orchestration.

The result? Downstream data is stale, inaccurate, or incomplete. Backfilling (rerunning jobs over historical data) is the common remedy, but traditional manual and ad hoc backfills are tedious, error-prone, and don't scale, which slows the resolution of common data quality issues.

In short, backfill runs in Lakeflow Jobs help you:

  • Ensure that you have the most complete and up-to-date datasets
  • Simplify and accelerate access to historical data with an intuitive, no-code interface
  • Improve data engineering productivity by eliminating the need for manual data searches and backfill processes

Click here to continue reading.

2 REPLIES

BS_THE_ANALYST
Esteemed Contributor III

@Sujitha very cool!

I've been learning all about Lakeflow as part of the Data Engineering Associate certification. This update couldn't have come at a better time! 

Can't wait to build something out with this 😎.

All the best,
BS

DebIT2011
New Contributor III

This will be extremely helpful and save us a lot of time. I’m really excited about it and look forward to using it as soon as it’s available for general use. I am currently waiting for the Lakeflow connector for the PostgreSQL database—could you please let me know when it will be generally available (GA)?