OPTIMIZE in parallel with actual data load

noorbasha534 — Mon, 21 Jul 2025 08:51:37 GMT

Dear all

If I understand correctly, OPTIMIZE cannot run in parallel with the actual data load. We see 'concurrent update' errors in our environment when this happens, and because of that we are unable to dedicate a maintenance window for table health.

And, I saw a presentation from DAIS 2025 that says liquid clustering can run in parallel with actual data load.

Please correct my understanding here.

Appreciate the mindshare...

Re: OPTIMIZE in parallel with actual data load

MariuszK — Mon, 21 Jul 2025 09:02:58 GMT

Liquid clustering reorganizes data incrementally, which works faster because it optimizes only new data. Compared to Z-order, it uses a different algorithm for data organization (Hilbert curve) that allows incremental clustering.

Re: OPTIMIZE in parallel with actual data load

noorbasha534 — Mon, 21 Jul 2025 09:57:55 GMT

@MariuszK this does not answer my question. Can I run OPTIMIZE in parallel with the data load of a liquid clustered table?

Re: OPTIMIZE in parallel with actual data load

szymon_dybczak — Mon, 21 Jul 2025 10:25:17 GMT

Hi @noorbasha534 ,

Yes, Liquid Clustering optimization can be executed on Delta tables automatically or manually: at write time with Auto Compaction enabled, or at any time using the OPTIMIZE command, respectively.

Liquid Clustering - The Internals of Delta Lake

Additionally, it is mentioned in the blog post below. Look for clustering on-write:

Announcing General Availability of Liquid Clustering | Databricks Blog
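
If a manual OPTIMIZE still collides with a running load, one common pattern is to retry the maintenance command instead of failing the job. The following is only a minimal sketch of that idea, not something taken from the linked posts: `retry_on_conflict` and `looks_like_conflict` are hypothetical helpers, the check for the word "concurrent" in the error text is an assumption based on the 'concurrent update' errors described above, and `main.sales.orders` is a placeholder table name.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_conflict(
    action: Callable[[], T],
    is_conflict: Callable[[Exception], bool],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `action`, retrying with exponential backoff on conflict errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return action()
        except Exception as exc:
            # Re-raise anything that is not a conflict, and give up
            # once the last allowed attempt has failed.
            if not is_conflict(exc) or attempt == max_attempts:
                raise
            # Wait 1x, 2x, 4x, ... the base delay before trying again.
            sleep(base_delay_s * 2 ** (attempt - 1))
            attempt += 1


def looks_like_conflict(exc: Exception) -> bool:
    # Assumption: the error message contains the word "concurrent".
    # Adjust this check to match the error you actually see.
    return "concurrent" in str(exc).lower()


# Hypothetical usage in a notebook where `spark` is already defined:
#
#   retry_on_conflict(
#       lambda: spark.sql("OPTIMIZE main.sales.orders"),
#       looks_like_conflict,
#   )
```

The `sleep` parameter is injectable only so the helper can be exercised without real waiting. Whether a retry is appropriate at all depends on which operation conflicts with which, so treat this as a starting point.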

Re: OPTIMIZE in parallel with actual data load

MariuszK — Mon, 21 Jul 2025 12:39:50 GMT

@noorbasha534, this is a good question, and it helps to clarify the topic. According to the documentation, yes, but honestly speaking, I haven't had a chance to check it in the described scenario.

Re: OPTIMIZE in parallel with actual data load

noorbasha534 — Mon, 21 Jul 2025 21:13:33 GMT

@MariuszK @szymon_dybczak thanks both. appreciate your support.