02-27-2025 03:22 PM
I would expect both the Python process on the driver and Spark's JVM to release memory once you are done with each chunk of data. If they don't, this sounds like a memory leak. If you suspect the problem is in the JVM, you can look at heap dumps: they are available in the Spark UI under the Executors tab, and you can also enable heap dumps on OOM. The usual suspects are things like HTTP sessions, or some other kind of session, not being released properly.
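To make the "heap dumps on OOM" part concrete, here is a rough sketch of the settings I have in mind. The two `-XX` flags are standard HotSpot JVM options, and `spark.driver.extraJavaOptions` / `spark.executor.extraJavaOptions` are regular Spark properties. The helper name `heap_dump_conf` and the dump directory `/tmp/heapdumps` are just placeholders I made up for illustration, so adjust them to your environment.

```python
def heap_dump_conf(dump_dir: str = "/tmp/heapdumps") -> dict:
    """Build Spark properties that ask the driver and executor JVMs
    to write a heap dump when they hit an OutOfMemoryError.

    dump_dir is a hypothetical path; it must exist and be writable
    on every node where a JVM runs.
    """
    jvm_flags = (
        "-XX:+HeapDumpOnOutOfMemoryError "
        f"-XX:HeapDumpPath={dump_dir}"
    )
    return {
        "spark.driver.extraJavaOptions": jvm_flags,
        "spark.executor.extraJavaOptions": jvm_flags,
    }


# Print the properties as key/value lines, e.g. to paste into the
# cluster's Spark config or to pass as --conf arguments to spark-submit.
for key, value in heap_dump_conf().items():
    print(f"{key} {value}")
```

These are JVM launch options, so they generally need to be in place before the JVMs start (cluster Spark config or `spark-submit`), not set from inside an already running session. The resulting `.hprof` files can then be opened in a heap analyzer to see which objects are holding on to memory.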