Clean driver during notebook execution
10-04-2023 12:23 AM
Is there any way to clear the driver's memory during the execution of my notebook?
I have several functions that run on the driver and create intermediate dataframes there. These dataframes are only needed for some calculations and are not required afterwards.
I would like to know whether there is a command to clean up the driver in the middle of the notebook's execution, freeing its memory so that it can keep storing data without performance problems.
Thanks!
10-04-2023 05:44 AM
Since Spark uses lazy execution, the dataframes you define are not materialized until an action uses them, so dataframes you do not need cannot be cleared separately (and if they are never used, why define them?).
When you run an action, Spark executes all the code that is necessary to produce the result. If you run into memory issues, you can do several things:
- see if you can write the code differently, for example using the Spark runtime instead of plain Python functions
- use a larger driver
- persist dataframes (using checkpoints/cache)
Since Spark executes code on the JVM, garbage collection runs automatically and reclaims memory allocations that are no longer needed.
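For objects that live in the driver's Python process (for example pandas dataframes or collected results, as opposed to JVM-side data), a similar idea applies: once nothing references an intermediate object, Python can reclaim it. The following is a minimal sketch using only the standard library; the `Intermediate` class and `summarize` function are hypothetical stand-ins, not part of Spark or of the original code. For Spark dataframes that were explicitly cached, `DataFrame.unpersist()` is the usual way to release them.

```python
import gc
import weakref


class Intermediate:
    """Hypothetical stand-in for a driver-side intermediate result."""

    def __init__(self, n):
        self.rows = list(range(n))


def summarize(n):
    # The intermediate object is only needed inside this function.
    tmp = Intermediate(n)
    probe = weakref.ref(tmp)  # weak reference, used only to observe cleanup
    total = sum(tmp.rows)
    del tmp                   # drop the only strong reference
    gc.collect()              # additionally collects reference cycles, if any
    return total, probe


total, probe = summarize(1000)
print(total)            # 499500
print(probe() is None)  # True in CPython: the intermediate object was freed
```

Keeping intermediates inside a function, as above, means they go out of scope when the function returns, so an explicit `del` is mostly needed for large objects held in notebook-level variables.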