Merge issue with column mask delta tables
07-11-2025 01:45 AM
I am facing an issue when merging a DataFrame into a Delta table that has a column mask applied to two of its columns.
Code
DeltaTable.forName(sparkSession=spark,tableOrViewName=f'{catalog}.{schema}.{table_name}').alias('target').merge(
new_df.alias('updates'),
'updates.customerID = target.customerID'
).whenNotMatchedInsertAll().execute()
Error
[MISSING_ATTRIBUTES.RESOLVED_ATTRIBUTE_APPEAR_IN_OPERATION] Resolved attribute(s) "first_name" missing from "customerID", "first_name", "last_name", "city", "state" in operator !Project [customerID#153L, redact AS first_name#178, redact AS last_name#179, redact AS city#180, redact AS state#181]. Attribute(s) with the same name appear in the operation: "first_name".
Please check if the right attribute(s) are used. SQLSTATE: XX000
File <command-6480109877101749>, line 4
1 DeltaTable.forName(sparkSession=spark,tableOrViewName=f'{catalog}.{schema}.{table_name}').alias('target').merge(
2 new_df.alias('updates'),
3 'updates.customerID = target.customerID'
----> 4 ).whenNotMatchedInsertAll().execute()
File /databricks/spark/python/pyspark/sql/connect/client/core.py:2377, in SparkConnectClient._handle_rpc_error(self, rpc_error)
2363 raise SparkConnectGrpcException(
2364 "Python versions in the Spark Connect client and server are different. "
2365 "To execute user-defined functions, client and server should have the "
(...)
2373 "sqlState", default=SparkConnectGrpcException.CLIENT_UNEXPECTED_MISSING_SQL_STATE),
2374 ) from None
2375 # END-EDGE
-> 2377 raise convert_exception(
2378 info,
2379 status.message,
2380 self._fetch_enriched_error(info),
2381 self._display_server_stack_trace(),
2382 ) from None
2384 raise SparkConnectGrpcException(
2385 message=status.message,
2386 sql_state=SparkConnectGrpcException.CLIENT_UNEXPECTED_MISSING_SQL_STATE, # EDGE
2387 ) from None
2388 else:
The merge works fine with spark.sql or with %sql, but fails with the Python syntax.
If the column mask is removed from the table, the Python merge works fine as well.
07-11-2025 07:32 AM
@arshadnehal, are you saying that you're not satisfied with the SQL solution and are looking for the Python equivalent?
Seems like an interesting problem!
All the best,
BS
07-11-2025 07:53 AM
It looks like the Delta Lake APIs (i.e. DeltaTable...) are not supported on tables that have row filters or column masks.
Please see limitations: https://docs.databricks.com/aws/en/tables/row-and-column-filters#limitations
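Since the original post reports that the same merge succeeds through spark.sql, one possible workaround is to stay in Python but issue the merge as SQL against a temporary view of the DataFrame. The sketch below is only an illustration under that assumption: the helper names build_insert_only_merge_sql and merge_new_rows and the view name updates_tmp are hypothetical, and the statement mirrors the insert-only whenNotMatchedInsertAll() call from the question.

```python
def build_insert_only_merge_sql(target_table: str, source_view: str, key: str) -> str:
    """Build a MERGE statement that inserts source rows whose key is absent from the target.

    Hypothetical helper: identifiers are interpolated as-is, so only pass trusted names.
    """
    return (
        f"MERGE INTO {target_table} AS target\n"
        f"USING {source_view} AS updates\n"
        f"ON updates.{key} = target.{key}\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


def merge_new_rows(spark, new_df, target_table: str, key: str,
                   source_view: str = "updates_tmp") -> None:
    """Expose the DataFrame as a temp view, then run the merge through spark.sql."""
    # 'updates_tmp' is an assumed view name; pick one that cannot clash with yours.
    new_df.createOrReplaceTempView(source_view)
    spark.sql(build_insert_only_merge_sql(target_table, source_view, key))


# In a notebook, with the names from the question:
# merge_new_rows(spark, new_df, f"{catalog}.{schema}.{table_name}", "customerID")
```

This keeps the rest of the pipeline in Python while avoiding the DeltaTable builder that raises the error above. It has not been verified against every runtime version, so treat it as a starting point.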