Greetings @praveenm00, good question, and you're right that AQE handles a lot automatically. But understanding physical plans is still worth the investment, especially at TB/PB scale, because AQE works within constraints. It can't fix a bad query structure, misconfigured settings, or unnecessary shuffles baked into your data model. The plan tells you what Spark actually decided to do, which is where any real tuning starts.
How to read the plan:
Use EXPLAIN in SQL or .explain() on a DataFrame. The variants worth knowing:
- EXPLAIN EXTENDED -- shows parsed, analyzed, optimized, and physical plans
- EXPLAIN FORMATTED -- cleaner output, easier to navigate
- EXPLAIN CODEGEN -- generated code, useful for CPU-level tuning
Read bottom-up. Data flows from the leaf nodes (scans) upward through transformations to the final output.
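To make the variants above concrete on the DataFrame side, here is a minimal sketch. It assumes `df` is a PySpark DataFrame on Spark 3.0 or later, where `DataFrame.explain` accepts a `mode` keyword; the helper name `print_plans` is made up for illustration.

```python
def print_plans(df, modes=("formatted", "extended")):
    """Print the plan of `df` once per explain mode.

    Assumes `df` is a PySpark DataFrame (Spark 3.0+). Valid modes there are
    "simple", "extended", "codegen", "cost" and "formatted".
    """
    for mode in modes:
        print(f"--- explain(mode={mode!r}) ---")
        # explain() prints the plan to stdout and returns None
        df.explain(mode=mode)
```

Calling something like `print_plans(orders.join(customers, "customer_id"))` (hypothetical table names) prints the formatted plan first, then the extended one, so you can compare them side by side.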
What to look for at scale:
- Exchanges (shuffles) -- the most expensive operations. Every Exchange node (ShuffleExchangeExec under the hood) is worth questioning. Can it be avoided with better partitioning or bucketing?
- Join strategy -- SortMergeJoin is common but expensive. AQE can promote to BroadcastHashJoin at runtime if one side is small enough. You can also raise the size limit with spark.sql.autoBroadcastJoinThreshold, or force a broadcast with a BROADCAST hint.
- Scan-to-output ratio -- if you're scanning 10B rows and keeping 1M, you want those filters pushed down or your data repartitioned.
- Partition count -- too few means oversized partitions, spill, and OOM risk, too many means scheduling overhead. AQE coalesces at runtime, but the starting point still matters.
- Skew detection -- look for SortMergeJoin on keys with a lopsided value distribution (a few hot keys carrying most of the rows) and confirm that spark.sql.adaptive.skewJoin.enabled is actually triggering, not just enabled.
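As a rough illustration of the first two checks, the sketch below counts operators in a plan's text so the shuffles and the join strategy stand out. The plan string is hand-written in the shape of Spark 3.x EXPLAIN output, not taken from a real job, and `summarize_plan` is a hypothetical helper, not a Spark API. On a real query you could feed it the text you copy out of EXPLAIN.

```python
import re
from collections import Counter

# Hand-written sample in the shape of Spark 3.x EXPLAIN output (illustrative only).
SAMPLE_PLAN = """\
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- SortMergeJoin [user_id#1L], [user_id#7L], Inner
   :- Sort [user_id#1L ASC NULLS FIRST], false, 0
   :  +- Exchange hashpartitioning(user_id#1L, 200), ENSURE_REQUIREMENTS, [plan_id=41]
   :     +- Filter isnotnull(user_id#1L)
   :        +- FileScan parquet [user_id#1L]
   +- Sort [user_id#7L ASC NULLS FIRST], false, 0
      +- Exchange hashpartitioning(user_id#7L, 200), ENSURE_REQUIREMENTS, [plan_id=45]
         +- Filter isnotnull(user_id#7L)
            +- FileScan parquet [user_id#7L]
"""

# Skip tree-drawing characters and codegen markers such as "*(2)",
# then capture the operator name that starts the node.
_OPERATOR = re.compile(r"^[\s:+\-*()\d]*([A-Za-z]\w*)")


def summarize_plan(plan_text):
    """Count how often each operator name appears in a physical plan."""
    counts = Counter()
    for line in plan_text.splitlines():
        match = _OPERATOR.match(line)
        if match:  # header lines like "== Physical Plan ==" do not match
            counts[match.group(1)] += 1
    return counts


counts = summarize_plan(SAMPLE_PLAN)
for name in ("Exchange", "SortMergeJoin", "BroadcastHashJoin"):
    print(name, counts[name])
# Exchange 2
# SortMergeJoin 1
# BroadcastHashJoin 0
```

Two Exchange nodes feeding one SortMergeJoin is the classic pattern to question: if one side is small, a broadcast join would remove both shuffles.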
Resources worth checking:
- Databricks docs on AQE -- covers what it handles and what it doesn't
- The Spark UI SQL tab -- the visual DAG is much easier to navigate than raw EXPLAIN output
- "High Performance Spark" by Holden Karau -- still the best deep dive on this
The short version: AQE is a safety net, not a substitute for understanding what your query is doing. At scale, the difference between a 20-minute job and a 2-hour job often comes down to one bad exchange or a skew case that AQE didn't catch. That's exactly the skill that interviewer was testing for.
Hope this helps you, Louis.