
Shuffle Partitions

payalbhatia
New Contributor II

What if I have a lot of empty shuffle partitions because of data skew?

Secondly, what if the shuffle partition size is 128 MB and the partition for a single key is 700 MB?

1 REPLY

Kaniz_Fatma
Community Manager

Hi @payalbhatia

Empty Shuffled Partitions Due to Data Skewness

  1. Salt the skewed keys: append a random "salt" value to each key so that rows for a hot key spread across several partitions instead of landing in one.
  2. Implement a custom partitioner that assigns keys to partitions based on your actual key distribution. In Spark, custom partitioners belong to the RDD API (partitionBy), not the DataFrame API.
  3. In a join, broadcast the smaller dataset so that the larger, skewed side does not have to be shuffled.
  4. Analyze a sample of your data to understand the key distribution, then adjust your partitioning strategy accordingly.
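
To make the salting idea in item 1 concrete, here is a minimal pure-Python sketch that simulates hash partitioning on toy data. It does not use Spark: `partition_for` is a hypothetical stand-in built on `zlib.crc32` (Spark uses a different hash), and the row counts, `NUM_PARTITIONS`, and `NUM_SALTS` are made-up values for illustration only.

```python
import random
import zlib
from collections import Counter

NUM_PARTITIONS = 8  # stand-in for spark.sql.shuffle.partitions (toy value)
NUM_SALTS = 8       # hypothetical number of salt buckets per key


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Deterministic stand-in for a hash partitioner (not Spark's actual hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def salted(key: str, rng: random.Random, num_salts: int = NUM_SALTS) -> str:
    """Append a random salt so one hot key becomes several distinct keys."""
    return f"{key}_{rng.randrange(num_salts)}"


def partition_sizes(keys):
    """Count rows per partition, including partitions that stay empty."""
    counts = Counter(partition_for(k) for k in keys)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]


rng = random.Random(42)

# Skewed toy data: one hot key with 7,000 rows and three small keys.
rows = ["hot"] * 7000 + ["a"] * 100 + ["b"] * 100 + ["c"] * 100

plain_sizes = partition_sizes(rows)
salted_sizes = partition_sizes([salted(k, rng) for k in rows])

print("unsalted:", plain_sizes)
print("salted:  ", salted_sizes)
```

With only four distinct keys, the unsalted run can fill at most four of the eight partitions, so the rest stay empty while one partition holds every "hot" row. After salting, the hot key is split into several keys that hash independently. In a real job the salt has to be undone afterwards: an aggregation needs a second pass that combines the per-salt results, and a join needs the other side replicated once per salt value.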

Managing Large Shuffle Partitions

If your target shuffle partition size is 128 MB but a single key's partition is 700 MB, the task that processes that partition will run much longer than the others and may spill to disk. Here are some ways to handle this:

  1. Increase the number of shuffle partitions by setting spark.sql.shuffle.partitions to a higher value, which reduces the average partition size. All rows with the same key still hash to the same partition, so this alone will not split a single 700 MB key; combine it with salting.
  2. Enable Adaptive Query Execution (AQE), which can coalesce small or empty shuffle partitions and split skewed partitions in joins based on runtime statistics.
  3. Explicitly repartition your data before the shuffle operation, on a column (or salted key) that gives a more balanced distribution.
  4. Keep input file sizes close to the target partition size so that reads do not start out with oversized partitions.
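
As a rough worked example for the numbers in the question, the sketch below estimates how many pieces a 700 MB key would need to be split into to fit a 128 MB target, and lists the AQE settings relevant to item 2. The configuration keys are standard Spark SQL settings; the values are illustrative assumptions, and `splits_needed` is a hypothetical helper, not a Spark function.

```python
import math

MB = 1024 * 1024


def splits_needed(partition_bytes: int, target_bytes: int) -> int:
    """Smallest number of equal pieces that each fit within the target size."""
    return max(1, math.ceil(partition_bytes / target_bytes))


hot_key_bytes = 700 * MB  # size of the skewed key's partition, from the question
target_bytes = 128 * MB   # target shuffle partition size, from the question

# 700 / 128 = 5.47, rounded up to 6 pieces (for example, 6 salt buckets).
num_salts = splits_needed(hot_key_bytes, target_bytes)
piece_mb = hot_key_bytes / num_salts / MB

print(f"split the hot key into {num_salts} pieces of about {piece_mb:.1f} MB")

# Spark SQL settings related to AQE; the values here are illustrative.
aqe_settings = {
    "spark.sql.adaptive.enabled": "true",
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    "spark.sql.adaptive.skewJoin.enabled": "true",
    "spark.sql.adaptive.advisoryPartitionSizeInBytes": "128m",
}

for key, value in aqe_settings.items():
    # With a live SparkSession named `spark` (assumed, not created here),
    # each setting would be applied as: spark.conf.set(key, value)
    print(f"{key}={value}")
```

The piece count is only a starting point: real key sizes vary between runs, so it is worth checking the shuffle read sizes per task in the Spark UI after changing anything.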

Would you like more detailed guidance on any of these strategies?
