12-10-2025 11:18 PM
What should you do when your dataset is uneven—some values appear too many times and others appear very few times—while working in Databricks?
12-11-2025 12:41 AM
Hi @Suheb ,
Refer to the really good guide prepared by the Databricks team. When you have a skewed dataset, the primary things you can do are the following:
1. Filter skewed values
2. Apply Skew hints
3. AQE skew optimization
4. Salting
A much more detailed description of the above terms can be found in the guide below:
Comprehensive Guide to Optimize Data Workloads | Databricks
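To make the salting item from the list above a bit more concrete, here is a minimal plain-Python sketch of the idea (it is not taken from the guide, and it does not use Spark). The function names, `NUM_SALTS`, and the sample data are all hypothetical. In Spark the same pattern is usually expressed by adding a random salt column to the large table and replicating the small table once per salt value, then joining on key plus salt.

```python
import random
from collections import Counter

NUM_SALTS = 8  # hypothetical value; in practice tuned to how skewed the hot key is


def salt_large_side(rows, num_salts=NUM_SALTS, seed=0):
    """Attach a random salt to each key so one hot key becomes several smaller keys."""
    rng = random.Random(seed)
    return [((key, rng.randrange(num_salts)), value) for key, value in rows]


def explode_small_side(rows, num_salts=NUM_SALTS):
    """Replicate each small-side row once per salt so every salted key finds a match."""
    return [((key, salt), value) for key, value in rows for salt in range(num_salts)]


def join(left, right):
    """Inner join on the first tuple element (right-side keys assumed unique)."""
    lookup = {key: value for key, value in right}
    return [(key, lv, lookup[key]) for key, lv in left if key in lookup]


def largest_group(rows):
    """Row count of the most frequent join key, a rough proxy for the slowest task."""
    return max(Counter(key for key, _ in rows).values())


if __name__ == "__main__":
    # Hypothetical skewed data: most rows share the key "hot".
    facts = [("hot", i) for i in range(900)] + [(f"k{i % 10}", i) for i in range(900, 1000)]
    dims = [("hot", "dim_hot")] + [(f"k{i}", f"dim_k{i}") for i in range(10)]

    salted_facts = salt_large_side(facts)
    joined = join(salted_facts, explode_small_side(dims))

    print("largest key group before salting:", largest_group(facts))
    print("largest key group after salting: ", largest_group(salted_facts))
    print("joined rows:", len(joined))
```

The trade-off is that the small side grows by a factor of `NUM_SALTS`, so salting tends to be worth trying only after the cheaper options in the list (filtering, skew hints, AQE) have not helped enough.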