03-04-2025 05:53 AM
Hello community,
I was working on optimising the driver memory, since some of our code is not well optimised for Spark, and as a temporary measure I was planning to restart the cluster to free up memory.
That could be a workable solution, because the cluster is idle during the first few minutes of each hour, which makes it a good moment to restart it and free up memory. However, looking at the standard output, it seems that no memory is actually freed. Why does this happen? Do I need to terminate and start the cluster instead of just restarting it?
03-04-2025 06:36 AM
Hi @jeremy98,
Generally, Databricks recommends regularly restarting clusters, particularly interactive ones, for routine clean-up. Restarting, or terminating and starting the cluster anew, stops all processes and frees up memory effectively, so a restart should indeed clean it up. You can verify this in your cluster metrics once the cluster has restarted.
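If you want to automate the hourly restart you mentioned, the Clusters API exposes a restart endpoint. A minimal sketch, assuming the workspace URL, personal access token and cluster ID are placeholders you would fill in:

import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                                   # placeholder
CLUSTER_ID = "<cluster-id>"                                         # placeholder

# Trigger a restart of a running cluster via the Clusters API.
resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/clusters/restart",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"cluster_id": CLUSTER_ID},
)
resp.raise_for_status()

You could call this from a scheduled job in the window where the cluster is idle.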
03-04-2025 06:47 AM - edited 03-04-2025 06:48 AM
Thanks Alberto, for the clarification! Yes, that is true; in effect the metrics UI now shows duplicated entries for the driver and the worker(s) after the restart. I think that is normal behaviour.
03-04-2025 06:53 AM
No problem, happy to assist!
03-04-2025 07:14 AM
Hi, I have another question: Usually, the driver should free memory by itself, but is it possible that the driver fails to do so? Why does this happen, and what issues can arise from this behavior?
03-04-2025 07:22 AM
Yes, it is indeed possible that the driver, for some reason, does not free up memory. If that happens, you will see these kinds of failures:
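For illustration, a pattern like the following (hypothetical table list and variable names) keeps references to collected results on the driver, so its memory keeps growing and garbage collection cannot reclaim it:

# Hypothetical illustration: holding collected results on the driver.
all_results = []
for table_name in table_names:          # placeholder list of tables
    df = spark.table(table_name)
    all_results.append(df.collect())    # entire table materialised on the driver and kept in memory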
03-04-2025 07:26 AM
Exactly, thanks, Alberto! But in general, is it best practice to restart a cluster every week to prevent this issue? Or does this problem happen because the code is not well-written?
03-04-2025 07:32 AM
Correct, it is best practice to restart the cluster regularly! Regular restarts help mitigate memory leaks and accumulated GC pressure.
As for whether it happens because of your code, that depends on what you are doing and whether you follow best practices; I would need more details to tell.
03-05-2025 09:48 AM
Hi,
The code synchronizes Databricks with PostgreSQL by identifying differences and applying INSERT, UPDATE, or DELETE operations to update PostgreSQL. The steps are as follows:
Could the issue be that psycopg2 is not a Spark API, so all of its execution is handled by the driver? Or is the .collect() operation causing a bottleneck by bringing too much data to the driver at once?
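For illustration, one way to avoid pulling everything to the driver with .collect() is to write from the executors using foreachPartition, opening a psycopg2 connection per partition. This is only a minimal sketch; the connection details, table name, columns and upsert statement are hypothetical placeholders:

import psycopg2
from psycopg2.extras import execute_values

def upsert_partition(rows):
    # One connection per partition, opened on the executor, not the driver.
    conn = psycopg2.connect(
        host="my-postgres-host",   # placeholder
        dbname="my_db",            # placeholder
        user="my_user",            # placeholder
        password="my_password",    # placeholder
    )
    try:
        with conn.cursor() as cur:
            execute_values(
                cur,
                """
                INSERT INTO target_table (id, value)   -- placeholder table/columns
                VALUES %s
                ON CONFLICT (id) DO UPDATE SET value = EXCLUDED.value
                """,
                [(r["id"], r["value"]) for r in rows],
            )
        conn.commit()
    finally:
        conn.close()

# diff_df is the DataFrame of differences computed in Spark (placeholder name).
diff_df.foreachPartition(upsert_partition)

With this pattern the driver only coordinates the job, and each executor writes its own partition of the diff to PostgreSQL.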
03-07-2025 02:06 AM
Any suggestions, @Alberto_Umana?