mohammedkhu
New Contributor II

@boitumelodikoko I am facing the exact same issue, but on all-purpose compute. It works well for smaller datasets, but for large datasets it fails with the same error.

The dataset I am working on has 13M rows, and I have scaled up to n2-highmem-8 (same for worker and driver, autoscaling 4-8), but this hasn't helped either. I am thinking of trying one size up to see how it goes.

@NandiniN Unfortunately, none of cache(), persist(), localCheckpoint(), or checkpoint() works; all of them fail with the same RETRIES_EXCEEDED error. I don't perform any joins per se, just some pivot operations provided by the discoverx library to scan all tables.

Do you have any other suggestions? Or is scaling up the cluster the only option?