Databricks Delta MERGE fails with row filter — “Cannot find column index for attribute 'account_id'”

vim17
New Contributor II

Problem:
I’m getting the below error when performing a MERGE (or any other DML command) on a Delta table with a row filter in Databricks.
Error: Cannot find column index for attribute 'account_id#48219' in: Map(transaction_id#47260 -> 5, file_path#47253 -> 0, numPhysicalRecords#49686L -> 3, defaultRowCommitVersion#47262L -> 7, version#47265L -> 9, size#47255L -> 1, account_number#48983 -> 10, baseRowId#47261L -> 6, transaction_tags#47259 -> 4, modificationTime#47256L -> 2, partitionValues_parsed#47264 -> 8)

I’ve verified that account_id is present in the table, yet the MERGE fails to resolve it in both cases, whether the row filter allows or blocks rows for the user.


Details:
I’ve created a minimal repro that demonstrates the issue; please find it below.
Merge operation:
target_txns.alias("target") \
    .merge(source_txns.alias("source"), merge_cond) \
    .whenMatchedUpdateAll() \
    .whenNotMatchedInsertAll() \
    .execute()
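
(A note on the setup: .merge() is a DeltaTable method, so target_txns above is the DeltaTable handle for the target table, with the sample DataFrames shown further down loaded into the tables first. Roughly:)

from delta.tables import DeltaTable

# Target handle for the MERGE; merge_cond joins on the key column, e.g.:
target_txns = DeltaTable.forName(spark, "fin_data.customer_db.account_transactions")
merge_cond = "target.transaction_id = source.transaction_id"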

The target table has this row filter:
CREATE FUNCTION fin_data.customer_db.filter_account(acc_id STRING) RETURNS BOOLEAN
RETURN acc_id IN (
    SELECT account_id
    FROM fin_data.customer_db.authorized_accounts
    WHERE email = current_user()
);
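
The filter is attached to the target table on account_id with the standard ALTER TABLE statement (run here via spark.sql):

# Bind the row filter to the target table's account_id column
spark.sql("""
ALTER TABLE fin_data.customer_db.account_transactions
SET ROW FILTER fin_data.customer_db.filter_account ON (account_id)
""")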

I keep running into this error. Any leads or help on this would be appreciated.

4 REPLIES

SP_6721
Honored Contributor

Hi @vim17,

It might be due to one of the limitations mentioned here: Row filters and column masks.
You could try simplifying the row filter function and then re-running the merge.
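
For instance, temporarily swapping the filter body for a constant (a sketch, not tested) would show whether the authorized_accounts subquery is what trips the MERGE:

# Temporarily replace the filter with a trivial constant predicate
# to check whether the subquery lookup is the culprit.
spark.sql("""
CREATE OR REPLACE FUNCTION fin_data.customer_db.filter_account(acc_id STRING)
RETURNS BOOLEAN
RETURN true
""")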

vim17
New Contributor II

I’ve verified that the row filter and column-mask policies do not contain nesting, aggregations, window functions, limits, or non-deterministic functions.
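
(For reference, the function can also be sanity-checked standalone, outside of any DML, like this:)

# Call the row-filter UDF directly to isolate it from the MERGE path
spark.sql("SELECT fin_data.customer_db.filter_account('ACC123') AS allowed").show()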

Amruth_Ashok
Databricks Employee

Hi @vim17, I see "partitionValues_parsed#47264" in the error trace. Is the table partitioned, by any chance?
Which DBR version are you using?

vim17
New Contributor II

Yes, the table is partitioned on transaction_date.

Sample target table schema:
CREATE TABLE fin_data.customer_db.account_transactions (
    transaction_id STRING,
    account_id STRING,
    transaction_date DATE,
    transaction_type STRING,
    transaction_amount DECIMAL(18,2),
    currency STRING,
    merchant_name STRING,
    merchant_category STRING,
    transaction_tags ARRAY<STRING>,
    modificationTime TIMESTAMP
)
USING DELTA
PARTITIONED BY (transaction_date)


Target table sample data:
target_txns = spark.createDataFrame([
    ("txn_001", "ACC123", "2025-08-01", "DEBIT", 150.00, "USD", "Amazon", "E-commerce", ["online"], "2025-08-01 10:00:00"),
    ("txn_002", "ACC124", "2025-08-02", "CREDIT", 2000.00, "USD", "Payroll", "Salary", ["monthly"], "2025-08-02 09:00:00")
], ["transaction_id", "account_id", "transaction_date", "transaction_type", "transaction_amount", "currency", "merchant_name", "merchant_category", "transaction_tags", "modificationTime"])
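
(The sample rows are appended to the partitioned table after casting the string literals to the declared DATE/TIMESTAMP/DECIMAL types; a rough sketch of the load step:)

from pyspark.sql.functions import col

# Cast string literals to the table's declared types, then append
(target_txns
    .withColumn("transaction_date", col("transaction_date").cast("date"))
    .withColumn("transaction_amount", col("transaction_amount").cast("decimal(18,2)"))
    .withColumn("modificationTime", col("modificationTime").cast("timestamp"))
    .write.format("delta").mode("append")
    .saveAsTable("fin_data.customer_db.account_transactions"))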


Source table sample data:
source_txns = spark.createDataFrame([
    ("txn_003", "ACC123", "2025-08-03", "DEBIT", 75.00, "USD", "Starbucks", "Cafe", ["food"], "2025-08-03 08:00:00"),
    ("txn_004", "ACC125", "2025-08-04", "DEBIT", 300.00, "USD", "Best ", "Electronics", ["gadget"], "2025-08-04 15:00:00")
], ["transaction_id", "account_id", "transaction_date", "transaction_type", "transaction_amount", "currency", "merchant_name", "merchant_category", "transaction_tags", "modificationTime"])

Currently, we are using DBR 15.4 and 16.4.