Update Delta Table with Apache Spark connector
07-14-2025 01:39 AM
Hi everyone. I'd like to ask a question about updating Delta tables using the Apache Spark connector.
Let's say I have two tables: one is a product dimension table with items from my shop, and the other contains a single column with the IDs of the products I want to update. According to the documentation (Delta Update), I’ve used a merge operation as follows:
deltaTable
  .as("current")
  .merge(
    updates.as("updates"),
    "current.product_id = updates.product_id"
  )
  .whenMatched
  .updateExpr(Map("description" -> "null"))
  .execute()

My question is: is it okay to use the merge operation strictly for updates, or is it recommended to always include an insert clause as well? Also, is it possible to use the 'update' operation described in the documentation to modify one table based on another?
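
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the setup. The table path, the sample rows, and the object name are hypothetical and only for illustration; it also assumes the Delta Lake Spark library is on the classpath and that a local Spark session is acceptable. The updates table deliberately includes an ID (99) that does not exist in the target, so the effect of leaving out the insert clause can be observed.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

object MergeUpdateOnlySketch {
  def main(args: Array[String]): Unit = {
    // Local session with the Delta Lake extension enabled
    // (assumption: the Delta Lake Spark library is on the classpath).
    val spark = SparkSession.builder()
      .appName("merge-update-only-sketch")
      .master("local[*]")
      .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
      .config("spark.sql.catalog.spark_catalog",
        "org.apache.spark.sql.delta.catalog.DeltaCatalog")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical location for the product dimension table.
    val path = "/tmp/delta/product_dim_sketch"

    // Hypothetical sample rows standing in for the product dimension table.
    Seq((1, "red mug"), (2, "blue mug"), (3, "green mug"))
      .toDF("product_id", "description")
      .write.format("delta").mode("overwrite").save(path)

    // Single-column table with the IDs to update; 99 has no match in the target.
    val updates = Seq(2, 3, 99).toDF("product_id")

    val deltaTable = DeltaTable.forPath(spark, path)

    // Same merge as in the post: only a whenMatched clause, no insert clause.
    deltaTable.as("current")
      .merge(updates.as("updates"), "current.product_id = updates.product_id")
      .whenMatched()
      .updateExpr(Map("description" -> "null"))
      .execute()

    // Inspect the table after the merge.
    deltaTable.toDF.orderBy("product_id").show()

    spark.stop()
  }
}
```

Running this and checking the output of `show()` should make it easy to compare the update-only merge against any alternative suggested in the replies.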