Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Lakeflow pipeline (formerly DLT pipeline) performance progressively degrades on a persistent cluster

rcostanza
New Contributor III

I have a small DLT pipeline (under 20 tables, all streaming) running in triggered mode, scheduled every 15 minutes during the workday. For development I've set `pipelines.clusterShutdown.delay` so the cluster stays up between updates and I don't have to start a new one for every run.
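For context, that setting goes in the pipeline's `configuration` map in its settings. A minimal sketch, shown as a Python dict mirroring the JSON; the 60-minute delay value is illustrative, not what I actually use:

```python
# Sketch of the "configuration" block from the pipeline settings JSON,
# written as a Python dict for readability. The 60-minute delay is illustrative.
pipeline_configuration = {
    # Keep the pipeline cluster alive after a triggered update finishes
    # (development convenience), so the next 15-minute run skips cluster startup.
    "pipelines.clusterShutdown.delay": "60m",
}
```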

I've noticed that update runtimes get progressively worse as time goes on, ultimately doubling after only 2h. The runtime keeps increasing even across updates in which none of the tables have new data to process; each table's individual update duration stays low, but the overall update runtime is high. Eventually we have to let the compute shut down and restart to regain performance.

Cluster metrics show nothing out of the ordinary: free memory slowly decreases over time but there's still plenty left, and CPU load stays well below its limit even at peak. There's nothing obviously wrong in the logs either.

I'm assuming that restarting the cluster periodically is expected to some degree, but if this were a continuous pipeline instead, where the cluster stays up until it's manually shut down, wouldn't this issue be even more prominent?

Is there a way to mitigate this without restarting the cluster several times a day?

1 REPLY

jerrygen78
New Contributor III

You're right to be concerned: this sounds like a classic case of memory or resource build-up over time, which can affect long-running jobs even when the metrics look fine on the surface. In triggered DLT (now Lakeflow) pipelines, tasks and streaming state can accumulate in memory on a cluster that never restarts, especially with streaming workloads. For a continuous pipeline, this degradation would likely be worse. While a restart is the simplest fix, you can mitigate it by optimizing stateful operations (such as streaming joins and aggregations), enabling state cleanup (for example, watermarks that bound how long state is retained), and making sure checkpoint locations aren't bloating. Also consider autoscaling compute with auto-shutdown between runs, so state is reset without manual restarts.
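If the slowdown is tied to growing streaming state, one standard lever in Spark Structured Streaming is adding a watermark before stateful operations so old state can be purged. A minimal sketch using the DLT Python API; the `events` source table, the `event_time` column, and the one-hour threshold are hypothetical:

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Windowed event counts with bounded aggregation state (sketch)")
def event_counts_bounded_state():
    return (
        dlt.read_stream("events")               # hypothetical upstream streaming table
        .withWatermark("event_time", "1 hour")  # allows Spark to drop aggregation state older than 1 hour
        .groupBy(F.window("event_time", "15 minutes"), "event_type")
        .count()
    )
```

Whether this helps depends on whether the slowdown really comes from state growth rather than driver-side overhead accumulating across updates, so treat it as one thing to try rather than a guaranteed fix.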

 