Clean driver memory during notebook execution

SaraCorralLou
New Contributor III

Is there any way to clear the driver's memory during the execution of my notebook?

I have several functions that run on the driver and create dataframes there that are only needed for some intermediate calculations.

I would like to know whether there is a command I can run mid-notebook to free the driver's memory, so that it can keep holding data without performance problems.

Thanks!

1 REPLY

-werners-
Esteemed Contributor III

Since Spark uses lazy execution, those dataframes are not materialized until you actually use them in an action (and if you never use them, why define them at all?). When you do run an action, Spark executes all the code needed to produce its result. If you run into memory issues, you can do several things:
- see if you can write the code differently, e.g. using the Spark runtime instead of plain Python functions
- use a larger driver
- persist dataframes (using checkpoints/cache) and release them when you are done, as in the sketch below
Since Spark executes code on the JVM, a garbage collector runs to reclaim unnecessary memory allocations.
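To make the cache/unpersist suggestion concrete, here is a minimal PySpark sketch of releasing intermediate results mid-notebook. It assumes the `spark` session that Databricks notebooks provide; the table name, columns, and the intermediate computation are hypothetical placeholders, not anything from the original post.

```python
import gc

# Hypothetical source table, standing in for whatever the functions read.
df = spark.read.table("some_catalog.some_schema.events")

# Cache the intermediate result only while it is actually reused.
intermediate = df.groupBy("user_id").count().cache()

# An action materializes the cache; the aggregation here is a placeholder.
summary = intermediate.agg({"count": "max"}).collect()

# Release the cached blocks once the calculation is done.
intermediate.unpersist()

# Optionally drop every cached dataframe/table in this session...
spark.catalog.clearCache()

# ...and nudge the driver's Python process to collect garbage.
# (The JVM heap is garbage-collected by Spark itself and rarely
# needs manual intervention.)
del df, intermediate
gc.collect()
```

Note that unpersist() and spark.catalog.clearCache() free Spark's cached blocks, while gc.collect() only affects the driver's Python process; the JVM side is garbage-collected automatically.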
