Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Data Skewness

dyusuf
New Contributor II

I am trying to visualize data skew with a simple aggregation example: a groupBy on a DataFrame whose data is highly skewed toward one customer. However, when I check the Spark UI, Databricks appears to balance the work automatically. Is there a configuration I need to disable in order to see the skew in the Spark UI?

Please clarify.

 

Thanks,

Yusuf

3 REPLIES

SantoshJoshi
New Contributor II

Hi @dyusuf ,

It could be because AQE (Adaptive Query Execution) is enabled.

...AQE, dynamically handles skew...

Please refer to the link below for more details:

https://docs.databricks.com/aws/en/optimizations/aqe

Can you please disable AQE and check if this works?

spark.conf.set("spark.sql.adaptive.enabled", "false")

HTH
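For reference, here is a minimal sketch of the AQE-related settings that are sometimes toggled when trying to observe skew. It assumes a Spark 3.x runtime; the `SparkSession` boilerplate is only there to make the snippet self-contained (in a Databricks notebook `spark` already exists). Note that, per the quoted documentation, AQE's skew handling targets joins, so these switches may have little effect on a plain groupBy.

```python
from pyspark.sql import SparkSession

# In a Databricks notebook `spark` is predefined; getOrCreate() just returns it.
spark = SparkSession.builder.getOrCreate()

# Option A: turn off AQE as a whole (the setting suggested in this reply).
spark.conf.set("spark.sql.adaptive.enabled", "false")

# Option B (alternative): leave AQE on but disable individual features.
# These two flags only matter while spark.sql.adaptive.enabled is "true".
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "false")
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "false")

# Read a value back to confirm what the session is actually using.
print(spark.conf.get("spark.sql.adaptive.enabled"))
```

Some managed environments may restrict which of these settings can be changed at the session level, so it is worth checking the value that is read back.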

 

dyusuf
New Contributor II

Thank you for your response. I already tried disabling AQE, but it doesn't work. Is there any other way we could see it?
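One possible explanation, offered as a hypothesis rather than a confirmed diagnosis: for common aggregates such as count or sum, Spark usually pre-aggregates inside each map task before the shuffle, so each task sends at most one record per key and the hot customer no longer dominates any shuffle partition. The plain-Python sketch below only simulates that effect. The row counts, task counts, and the `bucket` function are all made up for illustration and are not Spark's real partitioner.

```python
from collections import Counter

# Hypothetical skewed input: customer "C1" owns 9,700 of 10,000 rows.
rows = ["C1"] * 9_700 + [f"C{i}" for i in range(2, 302)]

NUM_MAP_TASKS = 4        # illustrative number of input partitions
NUM_SHUFFLE_PARTS = 8    # illustrative number of shuffle partitions


def bucket(key: str) -> int:
    """Deterministic stand-in for a hash partitioner (not Spark's real hash)."""
    return sum(key.encode()) % NUM_SHUFFLE_PARTS


# Input rows are spread evenly over map tasks, regardless of key.
map_tasks = [rows[i::NUM_MAP_TASKS] for i in range(NUM_MAP_TASKS)]

# Case 1: every raw row is shuffled by key (as in a join or repartition by key).
raw_shuffle = Counter(bucket(k) for task in map_tasks for k in task)

# Case 2: each map task pre-aggregates first, then shuffles one record
# per distinct key it saw (as in a partial aggregation before a groupBy).
partial_shuffle = Counter(bucket(k) for task in map_tasks for k in Counter(task))

print("records per shuffle partition, raw rows:      ", sorted(raw_shuffle.values()))
print("records per shuffle partition, pre-aggregated:", sorted(partial_shuffle.values()))
```

If this is what is happening, the skew would be expected to show up in operations that shuffle raw rows by the customer key, for example a shuffle-based join on that key or `df.repartition("customer_id")` (column name assumed), rather than in the groupBy itself.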

