Knowledge Sharing Hub

Understanding Coalesce, Skewed Joins, and Why AQE Doesn't Always Intervene

techgeorge
New Contributor II

In Spark, data skew can be the silent killer of performance. One oversized partition holding 90% of the data means one task does nearly all the work while the rest of the cluster waits.

But even with AQE (Adaptive Query Execution) turned on in Databricks, skewness isn't always automatically identified, and here's why.

What Is coalesce() in Spark?

The coalesce(n) function reduces the number of partitions in a DataFrame without a full shuffle. It is usually used to compact data after a wide transformation such as a join or groupBy, and it's especially useful when:

  • You're writing output to disk (e.g., Parquet, Delta) and want fewer files.

  • You're post-processing skewed data and want to redistribute load more evenly.

But because coalesce(n) only merges existing partitions and does not rebalance rows across them, a disproportionately large volume of data can remain concentrated in a single partition. The result is severe data skew, where one task handles the majority of the workload while the others remain underutilized.
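To make the effect concrete, here is a toy model in plain Python (not Spark code). It assumes coalesce(n) simply merges contiguous input partitions, which is a simplification of what Spark does when no locality preferences apply, and that repartition(n) spreads rows evenly through a full shuffle. The function names and the partition sizes are hypothetical, chosen only for illustration.

```python
from typing import List


def coalesce_sizes(sizes: List[int], n: int) -> List[int]:
    """Toy model of coalesce(n): merge contiguous partitions, move no rows."""
    n = min(n, len(sizes))  # coalesce never increases the partition count
    merged = []
    for i in range(n):
        start = i * len(sizes) // n
        end = (i + 1) * len(sizes) // n
        merged.append(sum(sizes[start:end]))
    return merged


def repartition_sizes(sizes: List[int], n: int) -> List[int]:
    """Toy model of repartition(n): a full shuffle spreads rows evenly."""
    base, extra = divmod(sum(sizes), n)
    return [base + (1 if i < extra else 0) for i in range(n)]


# Hypothetical row counts per partition after a skewed join.
skewed = [900, 20, 30, 50]

print(coalesce_sizes(skewed, 2))     # [920, 80]
print(repartition_sizes(skewed, 2))  # [500, 500]
```

In this model the hot partition survives coalesce untouched and simply absorbs a neighbour, while the shuffle-based path evens things out. In real code the equivalent change would be swapping df.coalesce(n) for df.repartition(n), at the cost of a full shuffle.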

[Image: Data Skew.png]

 

Shouldn't AQE (Adaptive Query Execution) have caught this?

Unlike repartition(n), the coalesce(n) operation does not trigger a full shuffle. AQE re-optimizes the plan at run time using statistics collected at shuffle boundaries, so without a shuffle there is nothing for it to inspect. The precondition for invoking AQE is never met, and the skew goes undetected.
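As a rough illustration of why the missing shuffle matters, the sketch below models the documented skew-join rule in plain Python: a shuffle partition counts as skewed when it is larger than skewedPartitionFactor times the median partition size and also larger than skewedPartitionThresholdInBytes. The constants are the defaults documented for open-source Spark 3.x (a given Databricks runtime may differ), and the function name and sizes are hypothetical. This is not how Spark implements the check internally, and the real rule applies to sort-merge joins only.

```python
from statistics import median
from typing import List, Optional

# Documented open-source Spark 3.x defaults (assumed here, may differ per runtime):
SKEW_FACTOR = 5  # spark.sql.adaptive.skewJoin.skewedPartitionFactor
SKEW_THRESHOLD_BYTES = 256 * 1024 * 1024  # ...skewJoin.skewedPartitionThresholdInBytes


def skewed_partitions(shuffle_sizes: Optional[List[int]]) -> List[int]:
    """Return indexes of partitions the documented rule would flag as skewed.

    shuffle_sizes is None when the plan has no shuffle, e.g. after coalesce(n).
    """
    if not shuffle_sizes:
        return []  # no shuffle statistics, so there is nothing to evaluate
    med = median(shuffle_sizes)
    return [
        i
        for i, size in enumerate(shuffle_sizes)
        if size > SKEW_FACTOR * med and size > SKEW_THRESHOLD_BYTES
    ]


MB = 1024 * 1024
# Hypothetical shuffle partition sizes: one hot partition among four.
print(skewed_partitions([9000 * MB, 200 * MB, 300 * MB, 500 * MB]))  # [0]
# After coalesce(n) there are no shuffle statistics to look at.
print(skewed_partitions(None))  # []
```

The second call is the situation described above: the data is just as skewed, but the rule never runs because it has no shuffle output to examine.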

Conclusion

AQE didn't help, not because it failed, but because we never gave it the chance.

 

@techgeorge
1 REPLY

BigRoux
Databricks Employee

@mark_ott, this question seems right up your alley. Care to comment?
