cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

SQL merge to update one of the nested column

SagarJi
New Contributor II

 

I am having existing delta-lake as target, and the small set of records at hand as CURRENT_BATCH,
I have a requirement to update dateTimeUpdated column inside parent2, using following merge query.
========

MERGE INTO mydataset AS target
USING CURRENT_BATCH AS incoming
ON target.parent1.companyId=incoming.parent1.companyId AND target.parent1.id=incoming.parent1.id AND incoming.is_deleted <> 'N'
WHEN MATCHED AND incoming.parent2.dateTimeUpdated > target.parent2.dateTimeUpdated
THEN UPDATE SET
target.is_deleted = incoming.is_deleted,
target.parent2.dateTimeUpdated = incoming.parent2.dateTimeUpdated
WHEN NOT MATCHED THEN INSERT *
======

However this query fails giving me following error

[DELTA_MERGE_UNEXPECTED_ASSIGNMENT_KEY]
Unexpected assignment key: class org.apache.spark.sql.catalyst.plans.logical.Assignment -
assignment(parent2#6623.dateTimeUpdated, parent2#6664.dateTimeUpdated)

It works fine if I update directly parent i.e target.parent2 = incoming.parent2, but I just need to update one of the nested columns, how do I do that?

 

1 REPLY 1

SagarJi
New Contributor II

I am using Apache Spark 3.5.1 and deltalake version 3.1.0

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group