Administration & Architecture

Conflicting with predictive optimization.

iercan06
New Contributor II

Hi. We have a continuous DLT pipeline whose tables update every minute and are partitioned by the partition_key column. Every 3-5 days, we hit the conflict error below, which is caused by predictive optimization. The pipeline runs fine after a restart, but I need a solution that doesn't require restarts or disabling predictive optimization. Thanks for any help.

[DELTA_METADATA_CHANGED] MetadataChangedException: The metadata of the Delta table has been changed by a concurrent update. Please try the operation again. Conflicting commit: {"timestamp":1771277469269,"userId":"8958709470301504","userName":"71346d5e-590c-49c0-9e13-e01e19f3ddca","operation":"SET TBLPROPERTIES","operationParameters":{"properties":{"delta.workloadBasedColumns.optimizerStatistics":"`partition_key`"}},"job":{"jobId":"52114596145184","jobName":"Predictive Optimization Job-d453e56b-97f1-425d-a33d-841bd8a3771f","jobRunId":"33557931172597","runId":"623858441565966","jobOwnerId":"8958709470301504","triggerType":"manual"},"clusterId":"0216-212153-y6seouuh-v2n","readVersion":247493,"isolationLevel":"WriteSerializable","isBlindAppend":true,"operationMetrics":{},"tags":{"maintenance":"true","delta.rowTracking.preserved":"true"},"engineInfo":"Databricks-Runtime/18.0.x-aarch64-photon-scala2.13","txnId":"a72c1747-833f-4223-8bd6-7fd61cc2aecf"} Refer to https://docs.databricks.com/delta/concurrency-control.html for more details.

3 REPLIES

pradeep_singh
Contributor

Here is a similar community post talking about the same issue - https://community.databricks.com/t5/data-engineering/conflict-between-predictive-optimization-and-hi...

Thank You
Pradeep Singh - https://www.linkedin.com/in/dbxdev

My question was marked as spam, which is why my colleague posted it again. Sorry for the duplication.

Kirankumarbs
Contributor

 

From the error message, I can see what is causing the issue: Predictive Optimization Job-d453e56b-97f1-425d-a33d-841bd8a3771f. This is a classic optimistic concurrency issue!
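If you want to pull those details out of the exception text programmatically (for alerting, say), a minimal Python sketch could look like the following. The helper name parse_conflicting_commit is hypothetical, the sample message is a shortened copy of the one in the question, and it assumes the JSON always follows the literal text "Conflicting commit: ".

```python
import json


def parse_conflicting_commit(message: str) -> dict:
    """Return the JSON object that follows 'Conflicting commit: ' in the error text."""
    marker = "Conflicting commit: "
    start = message.index(marker) + len(marker)
    # raw_decode parses one JSON value starting at `start` and ignores the
    # trailing "Refer to ..." text that comes after it.
    commit, _end = json.JSONDecoder().raw_decode(message, start)
    return commit


# Shortened sample based on the error in the question (most fields omitted).
message = (
    "[DELTA_METADATA_CHANGED] MetadataChangedException: The metadata of the Delta "
    "table has been changed by a concurrent update. Please try the operation again. "
    'Conflicting commit: {"operation":"SET TBLPROPERTIES",'
    '"job":{"jobName":"Predictive Optimization Job-d453e56b-97f1-425d-a33d-841bd8a3771f"},'
    '"tags":{"maintenance":"true"}} '
    "Refer to https://docs.databricks.com/delta/concurrency-control.html for more details."
)

commit = parse_conflicting_commit(message)
operation = commit["operation"]                   # which operation won the race
job_name = commit.get("job", {}).get("jobName")   # which job issued it
print(operation, "|", job_name)
```

The operation and job name are enough to tell a predictive optimization commit apart from a conflict caused by one of your own jobs.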

What's happening is that predictive optimization runs a background job that sets table properties (in your case, delta.workloadBasedColumns.optimizerStatistics) related to optimizer statistics. That's a metadata write. Your continuous pipeline is also writing to the same table roughly every minute. When both try to commit around the same time, you get a DELTA_METADATA_CHANGED conflict. Delta uses optimistic concurrency: if the table metadata changed between when your pipeline read it and when it tries to commit, the commit fails with this error.
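To make the mechanism concrete, here is a toy Python model of that check. It is only an illustration, not how Delta Lake is implemented: ToyTable, MetadataChanged, and the method names are all made up, and real Delta conflict detection covers more cases than this.

```python
class MetadataChanged(Exception):
    """Raised when table metadata changed after the writer took its snapshot."""


class ToyTable:
    def __init__(self):
        self.version = 0            # bumps on every successful commit
        self.metadata_version = 0   # bumps only when metadata changes
        self.rows = []

    def snapshot(self):
        # A writer remembers the metadata version it planned its write against.
        return self.metadata_version

    def set_property(self):
        # Stand-in for a background SET TBLPROPERTIES commit.
        self.metadata_version += 1
        self.version += 1

    def append(self, read_metadata_version, row):
        # Commit-time validation: fail if metadata moved since the snapshot.
        if read_metadata_version != self.metadata_version:
            raise MetadataChanged("metadata changed by a concurrent update")
        self.rows.append(row)
        self.version += 1


table = ToyTable()

# No concurrent metadata write: the append commits.
snap = table.snapshot()
table.append(snap, "batch-1")

# A metadata write lands between snapshot and commit: the append is rejected.
snap = table.snapshot()
table.set_property()
try:
    table.append(snap, "batch-2")
    conflict = False
except MetadataChanged:
    conflict = True

# Retrying against a fresh snapshot succeeds.
table.append(table.snapshot(), "batch-2")
```

The rejected append changes nothing in the table, which matches the "Please try the operation again" hint in the error: the same write goes through once it is planned against the current metadata.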

You can exclude just the specific tables that are written to by your continuous pipeline from predictive optimization, while keeping it enabled for everything else. You do this at the table level:

 
ALTER TABLE your_catalog.your_schema.your_table SET TBLPROPERTIES ('delta.enablePredictiveOptimization' = 'false');

This is probably the cleanest fix. Your continuous tables don't benefit much from predictive optimization anyway, since DLT handles compaction and optimization on its own for managed streaming tables.