Hi surajitDE,
Hope you're doing well. You're right to look into the GC (garbage collection) behavior: messages like "GCLocker Initiated GC" together with frequent young-gen collections usually mean the job is under memory pressure, which is common in object-heavy workloads like DLT. Switching to G1GC is a common fix for regular Spark jobs, but as you've noticed, DLT clusters (especially serverless or managed job clusters) don't expose JVM parameters such as GC settings.

That said, there are a few things you can try:

- Optimize your DLT logic by reducing unnecessary transformations and large nested structures held in memory.
- Increase the cluster size or memory per worker, even temporarily, to relieve the pressure.
- If you're working with complex JSON or heavy joins, flatten early and project only the columns you need (for example with .selectExpr()) to reduce GC overhead; see the sketch below.
- If you're on a dedicated (non-serverless) DLT pipeline, you can switch to a custom cluster policy that exposes advanced Spark/JVM settings, though that takes more effort; see the example settings further down.

Unfortunately, in fully serverless environments GC tuning isn't supported directly, so focusing on data and transformation optimizations is your best move. Let me know if you want help reviewing your pipeline logic for memory efficiency!
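In the meantime, here's a minimal sketch of the early-flattening idea as a DLT table. The table and column names (orders_raw, customer, amount, etc.) are placeholders for illustration, not your actual schema:

```python
import dlt

@dlt.table(name="orders_flattened", comment="Flattened projection of the raw orders feed")
def orders_flattened():
    # Read the upstream DLT table (placeholder name) as a stream.
    raw = dlt.read_stream("orders_raw")
    # Pull the nested fields you actually need up to top-level columns right away,
    # instead of carrying the wide nested struct through later joins/aggregations.
    return raw.selectExpr(
        "order_id",
        "customer.id AS customer_id",
        "customer.region AS region",
        "CAST(amount AS DOUBLE) AS amount",
    )
```

Projecting early like this keeps downstream stages working with a narrow, flat schema, which tends to reduce the amount of short-lived object churn that drives those frequent young-gen collections.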
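And if you do end up on a non-serverless pipeline where the cluster settings are editable, the JVM flags would go under spark_conf in the pipeline settings JSON, roughly like this. Whether extraJavaOptions is actually honored on DLT compute depends on your workspace and any cluster policy in place, so please treat this as something to verify rather than a guaranteed fix:

```json
{
  "clusters": [
    {
      "label": "default",
      "spark_conf": {
        "spark.driver.extraJavaOptions": "-XX:+UseG1GC",
        "spark.executor.extraJavaOptions": "-XX:+UseG1GC"
      }
    }
  ]
}
```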
Regards,
Brahma