GC Allocation Failure
02-05-2025 04:48 PM
There are a couple of related posts here and here.
I'm seeing a similar issue with a long-running job. The processes are in a "RUNNING" state and the cluster is active, but the stdout log shows the dreaded GC Allocation Failure.
Env:
I've set the following in the Spark config:
.config("spark.cleaner.referenceTracking.cleanCheckpoints", "true")
.config("spark.cleaner.periodicGC.interval", "1min")
and have attempted to clear the cache:
spark.catalog.clearCache()
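For reference, here's roughly how the session is built (a minimal sketch; the app name and checkpoint directory are placeholders, not our real values):

from pyspark.sql import SparkSession

# Placeholder app name and checkpoint dir; the real job uses different values.
spark = (
    SparkSession.builder
    .appName("long-running-job")
    # Clean up checkpoint files once the referencing RDDs are garbage collected
    .config("spark.cleaner.referenceTracking.cleanCheckpoints", "true")
    # Trigger GC on the driver every minute so the context cleaner runs more often
    .config("spark.cleaner.periodicGC.interval", "1min")
    .getOrCreate()
)
spark.sparkContext.setCheckpointDir("/tmp/checkpoints")

# Drop cached tables/DataFrames between stages of the job
spark.catalog.clearCache()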
Is there anything else I can try? Is it possible to set up an alert for this error to kill the job when it enters this state so we aren't burning through resources?
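One approach I'm considering is an external watchdog (a rough sketch only; the log path, pattern, and threshold below are illustrative assumptions, not something Spark or Databricks provides out of the box):

import re
import sys
import time

# Hypothetical driver stdout log path; adjust for your cluster's log location.
LOG_PATH = "/databricks/driver/logs/stdout"
# Typical JVM GC log line for this condition
PATTERN = re.compile(r"GC \(Allocation Failure\)")
THRESHOLD = 100  # arbitrary cutoff: alert after this many occurrences

def count_hits(path):
    # Count matching lines in the current log file
    hits = 0
    with open(path, "r", errors="ignore") as f:
        for line in f:
            if PATTERN.search(line):
                hits += 1
    return hits

while True:
    if count_hits(LOG_PATH) >= THRESHOLD:
        # Exit nonzero so an external scheduler or alerting hook can kill the job
        sys.exit(1)
    time.sleep(60)

The idea is that the watchdog's nonzero exit becomes the signal for whatever orchestrates the job to tear it down, rather than relying on anything inside the Spark process itself.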
Labels:
- Spark
0 REPLIES

