Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Clean driver during notebook execution

SaraCorralLou
New Contributor III

Is there any way to clear the driver's memory during the execution of my notebook?

I have several functions that run on the driver and create various dataframes there that are not needed afterwards (these dataframes exist only to do some calculations).

I would like to know if there is a command I can run in the middle of the notebook's execution to clean up the driver and free its memory, so that it can keep storing data without performance problems.

Thanks!

1 REPLY

-werners-
Esteemed Contributor III

Since Spark uses lazy execution, those dataframes are not materialized until you actually use them (why define them otherwise?). When you run an action, Spark executes all the code needed to produce the result. If you run into memory issues, you can do several things:
- see if you can write the code differently, e.g. using the Spark runtime instead of plain Python functions
- use a larger driver
- persist dataframes (using checkpoints/cache) and release them when you are done; see the sketch after this list
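
A minimal sketch of the cache/unpersist pattern, assuming a Databricks notebook where spark is already defined; the table name and the calculation are hypothetical:

```python
# Cache an intermediate dataframe so it is computed once,
# then release its blocks explicitly when the calculations are done.
df = spark.read.table("my_table")             # hypothetical table
intermediate = df.groupBy("key").count()      # hypothetical calculation
intermediate.cache()

result = intermediate.filter("count > 10").collect()

# Free the cached blocks once the dataframe is no longer needed.
intermediate.unpersist()

# Or drop every cached dataframe/table in one go.
spark.catalog.clearCache()
```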
Since Spark executes code on the JVM, garbage collection runs there automatically to reclaim unnecessary memory allocations.
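
If some of those intermediate objects live on the driver as plain Python objects (for example a pandas dataframe pulled down with toPandas()), you can drop the reference and trigger Python's collector yourself. A hedged sketch with a hypothetical local dataframe:

```python
import gc

import pandas as pd

# Hypothetical driver-local object used only for a calculation.
local_df = pd.DataFrame({"x": range(1_000_000)})
total = local_df["x"].sum()

# Drop the Python reference and ask CPython's collector to run;
# the JVM side is garbage-collected by Spark on its own.
del local_df
gc.collect()
```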
