02-05-2025 08:33 PM
@boitumelodikoko I am facing the exact same issue, but on all-purpose compute. It works well for smaller datasets, but for a large dataset it fails with the same error.
The dataset I am working on has 13M rows, and I have scaled up to n2-highmem-8 (same for worker and driver, autoscaling 4-8), but this hasn't helped either. I am thinking of trying another size up to see how it goes.
@NandiniN Unfortunately, none of cache(), persist(), localCheckpoint(), or checkpoint() work; all of them error out with the same RETRIES_EXCEEDED error. I don't perform any joins per se, just some pivot operations provided by the discoverx library to scan all tables.
Do you have any other suggestions? Or is scaling up the cluster the only option?