Re: Driver memory utilization grows continuously d... - Databricks Community

Hi @tsam ,

Can you share a few details?

Which DBR is the job on?
How many DEEP CLONEs do you need to run in total?
What is the parallelism of the for-each task?
Are the cloned tables optimized (for example, no "small file problem")?
Can you share the Heap Histogram of the Driver? It can be found in the Spark UI.
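
If it helps with the "small file problem" question, here is a rough sketch of one way to check it. It assumes the tables are Delta tables and relies on `DESCRIBE DETAIL`, which returns `numFiles` and `sizeInBytes` for a Delta table. The helper names and the 32 MiB cutoff are my own illustrative choices, not a Databricks recommendation.

```python
def avg_file_size_mb(num_files: int, size_in_bytes: int) -> float:
    """Average data file size in MiB; 0.0 for a table with no files."""
    if num_files <= 0:
        return 0.0
    return size_in_bytes / num_files / (1024 * 1024)


def looks_fragmented(num_files: int, size_in_bytes: int,
                     threshold_mb: float = 32.0) -> bool:
    """True when the table has many files whose average size is small.

    threshold_mb is an arbitrary illustrative cutoff (assumption).
    """
    return num_files > 1 and avg_file_size_mb(num_files, size_in_bytes) < threshold_mb


def check_table(spark, table_name: str) -> bool:
    """Run DESCRIBE DETAIL on a Delta table and apply the heuristic above.

    `spark` is an existing SparkSession; `table_name` is whatever table
    you are about to clone.
    """
    row = spark.sql(f"DESCRIBE DETAIL {table_name}").first()
    return looks_fragmented(row["numFiles"], row["sizeInBytes"])
```

In a notebook you could call `check_table(spark, "my_catalog.my_schema.my_table")` (a placeholder name) for each source table before cloning. A table that is flagged may be worth running `OPTIMIZE` on first.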

In the meantime, a simple fix I can suggest is to run the job on the most recent DBR version.

Best regards,