Data Engineering

Restarting an always-running cluster doesn't free the memory?

jeremy98
Contributor III

Hello community,

I was working on optimising the driver memory, since some of our code is not optimised for Spark, and as a temporary measure I was planning to restart the cluster to free up the memory.


[Attached screenshot: Screenshot 2025-03-04 at 14.49.44.png]

That could be a potential solution: since the cluster is idle in the first few minutes of each hour, that is a good moment to restart it and free up the memory. But looking at the standard output, it seems no memory is actually freed. Why this behaviour? Do I need to terminate and start the cluster instead of just restarting it?
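For reference, a scheduled restart like the one above can be triggered through the Databricks REST API clusters/restart endpoint. A minimal sketch (the host, token, and cluster ID are placeholders):

# Minimal sketch: restart a cluster via the Databricks REST API.
# DATABRICKS_HOST, DATABRICKS_TOKEN, and CLUSTER_ID are placeholders.
import os

import requests

DATABRICKS_HOST = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace>.cloud.databricks.com
DATABRICKS_TOKEN = os.environ["DATABRICKS_TOKEN"]  # personal access token
CLUSTER_ID = "<your-cluster-id>"                   # placeholder cluster ID

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/clusters/restart",
    headers={"Authorization": f"Bearer {DATABRICKS_TOKEN}"},
    json={"cluster_id": CLUSTER_ID},
)
resp.raise_for_status()  # a 200 response means the restart request was accepted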


Alberto_Umana
Databricks Employee

Hi @jeremy98,

Generally, Databricks recommends restarting clusters regularly, particularly interactive ones, for routine clean-up. Restarting, or terminating and starting the cluster anew, stops all processes and frees up memory effectively, so the restart should indeed clean it up. You can verify this in your cluster metrics once it has restarted.

jeremy98
Contributor III

Thanks, Alberto, for the clarification! Yes, it's true; effectively, the metrics UI doubled the logs for the driver and the worker(s). I think it is normal behaviour.

Alberto_Umana
Databricks Employee

No problem, happy to assist!

jeremy98
Contributor III

Hi, I have another question: usually the driver should free memory by itself, but is it possible for it to fail to do so? Why does that happen, and what issues can arise from this behaviour?

Alberto_Umana
Databricks Employee

Yes, it is possible that the driver, for some reason, does not free up memory. If that happens, you will see these kinds of failures:

  • Unresponsiveness: The driver may become unresponsive, leading to failed health checks and potential restarts or kills by watchdog mechanisms.
  • Frequent Restarts: Continuous memory pressure and GC overhead can cause the driver to restart frequently, leading to interruptions in job executions and degraded performance.
  • Out of Memory (OOM) Conditions: Eventually, the driver might run out of memory, leading to crashes and job failures with explicit OOM errors.
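To keep an eye on this, driver JVM heap usage can be checked from a notebook through PySpark's py4j gateway. A minimal sketch; note that spark.sparkContext._jvm is a private PySpark accessor, so treat this as a diagnostic convenience rather than a stable API:

# Minimal sketch: inspect driver JVM heap usage from a notebook.
# Uses the private py4j gateway (spark.sparkContext._jvm).
runtime = spark.sparkContext._jvm.java.lang.Runtime.getRuntime()

used_mb = (runtime.totalMemory() - runtime.freeMemory()) / (1024 * 1024)
max_mb = runtime.maxMemory() / (1024 * 1024)
print(f"Driver JVM heap: {used_mb:.0f} MB used of {max_mb:.0f} MB max")

If the used figure stays high long after your jobs finish, that is the pattern described above.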

 

jeremy98
Contributor III

Exactly, thanks, Alberto! But in general, is it best practice to restart a cluster every week to prevent this issue? Or does this problem happen because the code is not well written?

Alberto_Umana
Databricks Employee

Correct, it is best practice to restart the cluster regularly! Regular restarts help mitigate memory leaks and accumulated GC issues.

As for whether it happens because of your code: it depends on what you are doing and whether you follow best practices, but I would need more insight to tell.

jeremy98
Contributor III

Hi,

The code synchronizes Databricks with PostgreSQL by identifying differences and applying INSERT, UPDATE, or DELETE operations to bring PostgreSQL up to date. The steps are as follows:

  1. Read the source data in Databricks using a simple spark.sql query.
  2. Read the data from PostgreSQL using the JDBC driver.
  3. Perform a JOIN operation to identify differences.
  4. Collect the data using .collect() (I am now trying to use .toLocalIterator()).
  5. Chunk the data and iterate over it, executing DML operations using psycopg2 in batch (extras.execute_batch()), pushing a list of tuples with page_size=1000.
  6. …and that’s all.

Could the issue be that psycopg2 is not a Spark API, so all execution is handled by the driver? Or is the .collect() operation causing a bottleneck by bringing too much data to the driver at once?
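For what it's worth, a streaming version of steps 4–5 might look like the sketch below, assuming a hypothetical diff table and placeholder connection details, table, and column names:

# Minimal sketch of steps 4-5: stream rows to the driver with
# toLocalIterator() instead of collect(), then apply batched DML
# with psycopg2. Table, columns, and connection details are placeholders.
from itertools import islice

import psycopg2
from psycopg2 import extras

# Hypothetical DataFrame of differences produced by the JOIN in step 3.
diff_df = spark.sql("SELECT id, col1, col2 FROM diffs_to_apply")

UPDATE_SQL = "UPDATE target_table SET col1 = %s, col2 = %s WHERE id = %s"
CHUNK_SIZE = 1000

conn = psycopg2.connect(host="...", dbname="...", user="...", password="...")
try:
    with conn.cursor() as cur:
        # Only one chunk of rows is held on the driver at a time.
        rows = ((r["col1"], r["col2"], r["id"]) for r in diff_df.toLocalIterator())
        while True:
            chunk = list(islice(rows, CHUNK_SIZE))
            if not chunk:
                break
            extras.execute_batch(cur, UPDATE_SQL, chunk, page_size=CHUNK_SIZE)
    conn.commit()
finally:
    conn.close()

Either way, psycopg2 runs entirely on the driver; .toLocalIterator() only limits how many rows sit in driver memory at once. Moving the writes onto the executors (for example with df.foreachPartition(), opening one connection per partition) would take that work off the driver entirely.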

jeremy98
Contributor III

Any suggestions, @Alberto_Umana?
