Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to implement MERGE operations in Lakeflow Declarative Pipelines

yit
Contributor

Hey everyone,

We’ve been using Auto Loader extensively for a while, and now we’re looking to transition to full Lakeflow Declarative Pipelines. From what I’ve researched, the reader side seems straightforward.

For the writer, I understand that I can use a sink and provide the necessary options. What I’m not fully clear on is how to implement the MERGE logic. In my current Auto Loader setup, I handle this via foreachBatch.
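
For context, our current setup looks roughly like this (simplified sketch; the table name, source path, and key column are placeholders):

from delta.tables import DeltaTable

def upsert_batch(batch_df, batch_id):
    # MERGE each micro-batch into the target Delta table, matching on "id".
    target = DeltaTable.forName(spark, "my_catalog.my_schema.my_table")
    (target.alias("t")
        .merge(batch_df.alias("s"), "t.id = s.id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())

(spark.readStream
    .format("cloudFiles")                              # Auto Loader
    .option("cloudFiles.format", "json")
    .load("/mnt/raw/my_table")                         # placeholder path
    .writeStream
    .foreachBatch(upsert_batch)
    .option("checkpointLocation", "/mnt/checkpoints/my_table")
    .start())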

How should this be approached in the Lakeflow Declarative Pipelines framework? Can I still use foreachBatch? I couldn’t find any documentation on the topic.

Thanks in advance!

1 REPLY

saurabh18cs
Honored Contributor II

Hi @yit, Lakeflow Declarative Pipelines support upsert/merge semantics natively for Delta tables, so you don’t need a hand-written foreachBatch.

Instead of writing custom foreachBatch code, you declare the merge keys and update logic in your pipeline definition. Lakeflow then generates the necessary MERGE statements and handles the upserts for you.

 

For example, a sink definition with merge semantics might look roughly like this (illustrative pseudo-config, not exact syntax; check the Lakeflow sink docs for the precise schema):

sinks:
  my_delta_sink:
    type: delta
    path: /mnt/delta/my_table
    merge:
      keys: ["id"]            # columns to match for upsert
      whenMatched: update
      whenNotMatched: insert
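
If you’re writing the pipeline in Python, the documented route to declarative upserts is the APPLY CHANGES API (now called AUTO CDC), which generates the MERGE for you. A minimal sketch, with the source path, key, and sequencing column assumed:

import dlt
from pyspark.sql.functions import col

@dlt.view
def my_source_view():
    # Auto Loader feed for the merge; format and path are assumptions.
    return (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/mnt/raw/my_table"))

# The MERGE target must be declared as a streaming table first.
dlt.create_streaming_table("my_table")

# Declarative upsert: match rows on "id", use "event_ts" to order
# late or duplicate events; Lakeflow generates the MERGE itself.
dlt.apply_changes(
    target="my_table",
    source="my_source_view",
    keys=["id"],
    sequence_by=col("event_ts"),
    stored_as_scd_type=1,       # type 1 = plain upsert, type 2 keeps history
)

With this pattern the streaming table is the pipeline-managed target, so there’s no custom merge loop to maintain at all.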
