How to read and optimize Physical plans in Spark to optimize for TBs and PBs of data workflows
Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
03-19-2026 09:07 PM
One of the Amazon interviews I attended, which was for a Big data engineer asked me for this particular skill of reading and understanding physical plans in spark to optimize MASSIVE dataloads. But I though spark automatically does all these optimizations on its own with respect to optimizing plans using Adaptive query execution. Am I missing something? If so, how do I address this? Great if you folks had experience on the same and could share me some best resources.
Thank you!