Spark out of memory error. You can resolve this error by increasing the size of the cluster in Databricks.
11-29-2022 12:08 PM
Spark out of memory error.
You can resolve this error by increasing the size of the cluster in Databricks.
- Labels:
  - Error
  - Memory error
11-29-2022 01:11 PM
@S S Increasing the cluster size every time may not be a good solution; the right fix depends on the scenario:
- sometimes we may need to tweak the code
- sometimes we may need to add memory parameters (see the sketch after this list)
- the Ganglia metrics can give more information about where memory is being used
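For the memory-parameter point, here is a minimal sketch of the kind of settings involved (assuming a PySpark job; the values are illustrative, and on Databricks these are usually set in the cluster's Spark config rather than in code):

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("oom-tuning-example")
    # Give each executor more heap instead of resizing the whole cluster.
    .config("spark.executor.memory", "8g")
    # Extra off-heap/overhead room for shuffle buffers and Python workers.
    .config("spark.executor.memoryOverhead", "2g")
    # Fraction of the heap reserved for execution + storage (default 0.6).
    .config("spark.memory.fraction", "0.7")
    .getOrCreate()
)
```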
11-29-2022 08:00 PM
Hi guys,
I agree, it is better to improve your code rather than increase the size of the cluster. You can configure the number of partitions, for example (see the sketch below).
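A minimal PySpark sketch of tuning the partition count (the numbers and the input path are illustrative assumptions, not recommendations):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Number of partitions used after wide (shuffle) operations such as joins and groupBy.
spark.conf.set("spark.sql.shuffle.partitions", "400")

# Hypothetical input path.
df = spark.read.parquet("/path/to/input")

# Split oversized partitions so no single task has to hold too much data in memory at once.
df = df.repartition(400)
```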
11-29-2022 10:41 PM
Directly jumping to increasing the cluster size is not advisable. I found this nicely written blog on what the potential reasons could be and some initial steps to resolve the OOM error in Spark:
https://medium.com/swlh/spark-oom-error-closeup-462c7a01709d
11-30-2022 02:00 AM
Adding some more points to @karthik p 's answer (a combined sketch follows the list):
- Use the Kryo serializer instead of the default Java serializer.
- Use an optimised garbage collector such as G1GC.
- Partition wisely on a field.
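A minimal PySpark sketch of those three points together (the partition column and the paths are illustrative assumptions):

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    # Kryo is more compact and faster than the default Java serialization.
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    # Ask the executor JVMs to use the G1 garbage collector.
    .config("spark.executor.extraJavaOptions", "-XX:+UseG1GC")
    .getOrCreate()
)

# Hypothetical input and output paths; "event_date" is an assumed partition column.
df = spark.read.parquet("/path/to/events")

# Writing partitioned by a well-chosen column keeps individual partitions small.
df.write.partitionBy("event_date").mode("overwrite").parquet("/path/to/output")
```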

